PROMOTER



PCT/NO94/00190 



The present invention relates to a promoter and to a conjugate comprising the same. The 
present invention also relates to the use of the promoter for stage- and tissue- specific 
expression of a gene of interest (GOI). The present invention also relates to the genomic 
nucleotide sequence of, and isolation of, the promoter. 

In particular the present invention relates to a promoter for a lipid transfer protein (Ltp) gene
known as the Ltp2 gene. The present invention also relates to the application of this Ltp2
gene promoter to express a GOI specifically in the aleurone layer of a monocotyledon - 
especially a transgenic cereal seed - more especially a developing transgenic cereal seed. 

A mature cereal seed contains two distinct organs: the embryo - which gives rise to the 
vegetative plant - and the endosperm - which supports the growth of the emerging seedling
during a short period of time after germination. The endosperm, which is the site of 
deposition of different storage products such as starch and proteins, is further sub-divisible 
into a peripheral layer of living aleurone cells surrounding a central mass of non-living 
starchy endosperm cells. 

The aleurone cells differentiate from primary endosperm cells early during seed development,
or between 10 and 21 days after fertilization. The aleurone layer and embryo share many
similarities in their gene expression programmes. They are the only cereal seed tissues that
survive the desiccation process during seed maturation and they both have active gene
transcription during seed germination.

The aleurone layer of cereal seeds comprises specialized cells that surround the central 
starchy endosperm, i.e. the site for starch and protein accumulation in the developing seed 
(Bosnes et al., 1992; Olsen et al., 1992). During seed germination, the cells of the aleurone
layer produce amylolytic and proteolytic enzymes that degrade the storage compounds into
metabolites that are taken up and are used by the growing embryo. Two aspects of aleurone
cell biology that have been intensively studied are the genetics of anthocyanin pigmentation
of aleurone cells in maize (McClintock, 1987) and the hormonal regulation of gene
transcription in the aleurone layer of germinating barley seeds (Fincher, 1989).

Using transposon tagging, several structural and regulatory genes in the anthocyanin synthesis
pathway have been isolated and characterized (Paz-Ares et al., 1987; Dellaporta et al.,
1988). In barley, alpha-amylase and beta-glucanase genes that are expressed both in the
aleurone layer and embryos of mature germinating seeds have been identified (Karrer et al.,
1991; Slakeski and Fincher, 1992). In addition, two other cDNAs representing transcripts
that are differentially expressed in the aleurone layers of developing barley grains have been
isolated. These are CHI26 (Lea et al., 1991) and pZE40 (Smith et al., 1992). For none of
these gene products has it been shown in transgenic cereal plants that the promoter directs
expression in just the aleurone layer of developing grains.

Non-specific lipid transfer proteins (nsLTPs) have the ability to mediate in vitro transfer of
radiolabelled phospholipids from liposomal donor membranes to mitochondrial acceptor
membranes (Kader et al., 1984; Watanabe and Yamada, 1986). Although their in vivo
function remains unclear, nsLTPs from plants have recently received much attention due to
their recurrent isolation as cDNA clones representing developmentally regulated transcripts
expressed in several different tissues. A common feature is that, at some point in
development, they are highly expressed in tissues producing an extracellular layer rich in
lipids. Thus, transcripts corresponding to cDNAs encoding 10 kDa nsLTPs have been
characterized in the tapetum cells of anthers as well as the epidermal layers of leaves and
shoots in tobacco (Koltunow et al., 1990; Fleming et al., 1992), and barley aleurone layers
(Mundy and Rogers, 1986; Jakobsen et al., 1989).

In addition, a 10 kDa nsLTP was discovered to be one of the proteins secreted from auxin-
treated somatic carrot embryos into the tissue culture medium (Sterk et al., 1991). Based
on in situ data demonstrating that the Ltp transcripts are localized in the protoderm cells of
the somatic and zygotic carrot embryo and in the epithelial layer of the maize embryonic
scutellum, it was suggested that in vivo nsLTPs are involved in either cutin biosynthesis or
in the biogenesis and degradation of storage lipids (Sossountzov et al., 1991; Sterk et al.,
1991).

A nsLTP in Arabidopsis has been localized to the cell walls, lending further support to an
extracellular function of this class of proteins (Thoma et al., 1993).

PCT WO 90/01551 mentions the use of the aleurone cells of mature, germinating seeds to
produce proteins from GOIs under the control of an alpha-amylase promoter. This promoter
is active only in germinating seeds.

Recently, using a standard in vitro Ltp assay, two 10 kDa and one member of a novel class
of 7 kDa nsLTPs were isolated from wheat seeds (Monnet, 1990; Dieryck et al., 1992). The
sequence of this 7 kDa wheat nsLtp protein shows a high degree of similarity with the
predicted protein from the open reading frame (ORF) of the Bz11E cDNA, which had been
isolated in a differential screening for barley aleurone specific transcripts (Jakobsen et al.,
1989). However, the amino acid sequence of this polypeptide showed only limited sequence
identities with the previously sequenced 10 kDa proteins. In sub-cellular localisation studies
using gold labelled antibodies one 10 kDa protein from Arabidopsis was localised to the cell
wall of epidermal leaf cells. The presence of a signal peptide domain in the N-terminus of
the open reading frames of all characterised plant nsLTP cDNAs also suggests that these
are proteins destined for the secretory pathway with a possible extracellular function.

Olsen et al. in a paper titled "Molecular Strategies For Improving Pre-Harvest Sprouting
Resistance In Cereals" published in 1990 in the published extracts from the Fifth
International Symposium On Pre-Harvest Sprouting In Cereals (Westview Press Inc.)
describe three different strategies for expressing different "effector" genes in the aleurone
layer and the scutellum in developing grains of transgenic plants. This document mentions
4 promoter systems - including a system called B11E (which is now recognised as being the
same as the Ltp2 gene promoter). There is no sequence listing for B11E given in this
document.

Kalla et al. (1993) in a paper titled "Characterisation of Promoter Elements Of Aleurone
Specific Genes From Barley" describe the possibility of the expression of anti-sense genes
by the use of promoters of the aleurone genes B22E, B23D, B14D, and B11E (which is now
shown to be the same as Ltp2).

The Kalla et al. (1993) paper gives a very general map of the Ltp2 gene promoter. The
transient expression results showed very low levels of expression of the reporter gene.

A sequence listing of the Ltp2 gene was available as of 23 December 1992 on the EMBL
database.

One of the major limitations to the molecular breeding of new varieties of crop plants with 
aleurone cells expressing GOIs is the lack of a suitable aleurone specific promoter.

At present, the available promoters - such as the CaMV 35S, rice actin and maize alcohol
dehydrogenase promoters - are all constitutive. In this regard, they are non-specific with
respect to target site or developmental stage, as they drive expression in most cell types in
the plants.

Another problem is how to achieve expression of a product coded for by a GOI in the 
aleurone layer of the endosperm that gives minimal interference with the developing embryo
and seedling. 

It is therefore desirable to provide aleurone specific expression of GOIs in cereals such as
rice, maize, wheat, barley and other transgenic cereal plants. 

Moreover it is desirable to provide aleurone specific expression that does not lead to the 
detriment of the developing embryo and seedling.

According to a first aspect of the present invention there is provided a Ltp2 gene promoter 
comprising: 

the sequence shown as SEQ. I.D. 1, or 

a sequence that has substantial homology with that of SEQ. I.D. 1, or 
a variant thereof. 

According to a second aspect of the present invention there is provided a conjugate
comprising a GOI and a Ltp2 gene promoter as just defined.

According to a third aspect of the present invention there is provided an in vivo expression
system comprising a conjugate comprising a GOI and a Ltp2 gene promoter as just defined
wherein the conjugate is integrated, preferably stably integrated, within a monocotyledon's
(preferably a cereal's) genomic DNA.

According to a fourth aspect of the present invention there is provided a transgenic cereal 
comprising a conjugate comprising a GOI and a Ltp2 gene promoter as just defined wherein 
the conjugate is integrated, preferably stably integrated, within a cereal's genomic DNA.

According to a fifth aspect of the present invention there is provided the in vivo expression
in the aleurone cells of a monocotyledon (preferably a cereal) of a conjugate comprising a
GOI and a Ltp2 gene promoter as just defined; wherein the conjugate is integrated,
preferably stably integrated, within the monocotyledon's genomic DNA.

According to a sixth aspect of the present invention there is provided a method of enhancing
the in vivo expression of a GOI in just the aleurone cells of a monocotyledon (preferably a
cereal) which comprises stably inserting into the genome of those cells a DNA conjugate
comprising a Ltp2 gene promoter as just defined and a GOI; wherein in the formation of the
conjugate the Ltp2 gene promoter is ligated to the GOI in such a manner that each of the
myb site and the myc site in the Ltp2 gene promoter is maintained substantially intact.

According to a seventh aspect of the present invention there is provided the use of a myb site
and a myc site in an Ltp2 gene promoter to enhance in vivo expression of a GOI in just
the aleurone cells of a monocotyledon (preferably a cereal) wherein the Ltp2 gene promoter
and the GOI are integrated into the genome of the monocotyledon. 

According to an eighth aspect of the present invention there is provided a method of
enhancing the in vivo expression of a GOI in just the aleurone cells of a monocotyledon
(preferably a cereal) which comprises stably inserting into the genome of those cells a DNA
conjugate comprising a Ltp2 gene promoter as just defined and a GOI; wherein in the
formation of the conjugate the Ltp2 gene promoter is ligated to the GOI in such a manner
that any one of the SphI site, the AL site or the DS site in the Ltp2 gene promoter is (are)
maintained substantially intact. The SphI site, the AL site and the DS site are defined later.

Preferably the promoter is a barley aleurone specific promoter.

Preferably the promoter is for a 7 kDa lipid transfer protein. 

Preferably the promoter is used for expression of a GOI in a cereal seed. 

Preferably the promoter is used for expression of a GOI in a monocotyledonous species, 
including a grass - preferably a transgenic cereal seed. 

Preferably the cereal seed is any one of a rice, maize, wheat, or barley seed.

Preferably the promoter is the promoter for Ltp2 of Hordeum vulgare. 

Preferably at least one additional sequence is attached to the promoter gene or is present in 
the conjugate to increase expression of a GOI or the GOI. 

The additional sequence may be one or more repeats (e.g. tandem repeats) of the promoter
upstream box(es) which are responsible for the aleurone specific pattern of expression of
Ltp2. The additional sequence may even be a Sh1 intron.

The term "GOI" with reference to the present invention means any gene of interest - but not 
the remainder of the natural Ltp2 gene for the cereal in question. A GOI can be any gene 
that is either foreign or natural to the cereal in question. 

Typical examples of a GOI include genes encoding proteins giving, for example, added
nutritional value to the seed as a food or crop or, for example, increased pathogen resistance.
The GOI may even be an antisense construct for modifying the expression of natural
transcripts present in the relevant tissues.

Preferably the GOI is a gene encoding any one of a protein having a high nutritional
value, a Bacillus thuringiensis insect toxin, or an alpha- or beta-amylase or germination
induced protease antisense transcript.

The term "a variant thereof" with reference to the present invention means any substitution
of, variation of, modification of, replacement of, deletion of or the addition of one or more
nucleic acid(s) from or to the listed promoter sequence providing the resultant sequence 
exhibits aleurone specific expression. 

The term "substantial homology" covers homology with respect to at least the essential 
nucleic acids of the listed promoter sequence providing the homologous sequence exhibits
aleurone specific expression. Preferably there is at least 80% homology, more preferably
at least 90% homology, and even more preferably there is at least 95% homology with the 
listed promoter sequence. 
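
As an illustration only, the percentage thresholds above can be read as per-position identity over a pair of sequences that have already been aligned. The sketch below assumes that reading; the specification itself does not define how homology is to be computed, and the function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return the percentage of aligned columns that carry the same base.

    Assumption: both strings come from a prior alignment step, so they
    have equal length; a gap character '-' never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """Check a candidate against one of the tiers (80, 90 or 95 per cent)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

A candidate that passes such a numerical check would still need to show aleurone specific expression to fall within the definition given above.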

The term "maintained substantially intact" means that at least the essential components of
each of the myb site and the myc site remain in the conjugate to ensure aleurone specific
expression of a GOI. Preferably at least about 75%, more preferably at least about 90%, 
of the myb or myc site is left intact. 

The term "conjugate", which is synonymous with the terms "construct" and "hybrid", covers
a GOI directly or indirectly attached to the promoter gene to form a Ltp2-GOI cassette. An
example of an indirect attachment is the provision of a suitable spacer group such as an
intron sequence, such as the Sh1-intron, intermediate the promoter and the GOI.

The present invention therefore provides the novel and inventive use of an aleurone specific 
promoter - namely the use of the Ltp2 gene promoter, preferably the Ltp2 gene promoter
from barley.

The main advantage of the present invention is that the use of the Ltp2 gene promoter results
in specific expression of a GOI in the aleurone layer(s) of cereals such as rice,
maize, wheat, barley and other transgenic cereal seeds, preferably maize seed.

It is particularly advantageous that the expression is both stage- and tissue- specific.

A further advantage is that the expression of the product coded for by a GOI in the aleurone 
layer of the endosperm gives minimal interference with the developing embryo and seedling. 
This is in direct contrast to constitutive promoters which give high levels of expression in 
the developing seedling and mature plant tissues which severely affect normal plant 
development. 

The present invention is particularly useful for expressing GOI in the aleurone layer of 
developing grains - such as cereal seeds. 

With regard to the present invention it is to be noted the EMBL database sequence listing 
(ibid) does not suggest that the Ltp2 gene promoter could be used to express a GOI in a
stage- and tissue- specific manner. Also the database extract does not mention the
importance of the myb gene segment or the myc gene segment.

It is also to be noted the paper titled "Molecular Strategies For Improving Pre-Harvest 
Sprouting Resistance In Cereals" (ibid) does not give any specific sequence listing
information for the Ltp2 gene promoter. Also there is no explicit mention in this paper of
using just the Ltp2 gene promoter to induce expression in just aleurone cells. Moreover,
there is no mention in this paper of an Ltp2 - GOI conjugate being formed. Also there is 
no mention in this paper of the importance of the myb site or the myc site. 

It is also to be noted that in the paper titled "Characterisation of Promoter Elements Of 
Aleurone Specific Genes From Barley" (ibid) there is no mention of an Ltp2 - GOI conjugate
stably integrated into genomic DNA of a cereal. Also there is no explicit disclosure of an
in vivo expression system. Moreover, there is no full sequence listing in this paper for the
Ltp2 gene promoter. Also there is no explicit mention in this paper of the importance of the
myb site or the myc site of the Ltp2 gene promoter for in vivo GOI expression.

In contrast to the work disclosed in PCT WO 90/01551, the Ltp2 gene promoter (which is
not disclosed in PCT WO 90/01551) results in aleurone specific
expression in developing grains.

In general, therefore, the present invention relates to a promoter for a Ltp2 gene encoding 
a 7 kDa nsLTP. In situ hybridization analysis demonstrates that the Ltp2 transcript is
expressed exclusively in aleurone cells from the beginning of the differentiation stage and 
half-way into the maturation stage. Further commentary on the maturation stages is provided 
by Bosnes et al., 1992. 

The Ltp2 gene promoter may be inserted into a plasmid. For example, the Ltp2 BglII 0.84
kb fragment can be inserted into the BamHI site of Bluescript. A GOI, such as GUS, can
then be inserted into this conjugate (construct). Furthermore, a Sh1 intron can then be
inserted into the SmaI site of this conjugate.
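
The order of elements that results from these cloning steps can be written down as a simple ordered list. The sketch below is only a bookkeeping aid under stated assumptions: the `Cassette` class and its methods are hypothetical names introduced for illustration, and the promoter - Sh1 intron - GUS order is taken from the description of the deposited promoter-intron-GUS conjugate given later in this specification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cassette:
    """Ordered 5'->3' list of named elements in a promoter-GOI conjugate."""

    elements: List[str] = field(default_factory=list)

    def insert_after(self, existing: str, new: str) -> None:
        """Place `new` immediately downstream of the element `existing`."""
        idx = self.elements.index(existing)
        self.elements.insert(idx + 1, new)

    def describe(self) -> str:
        return " - ".join(self.elements)


def build_example() -> Cassette:
    # Step 1: the promoter fragment is cloned into the vector.
    cassette = Cassette(["Ltp2 promoter"])
    # Step 2: the GOI (here the GUS reporter) is placed downstream.
    cassette.insert_after("Ltp2 promoter", "GUS")
    # Step 3: the intron spacer goes between the promoter and the GOI.
    cassette.insert_after("Ltp2 promoter", "Sh1 intron")
    return cassette
```

Calling `build_example().describe()` gives the element order as a single string, which can be compared with the construct names used in the deposits.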

Stable integration may be achieved by using the method of Shimamoto (1989). Another way 
is by bombardment of an embryonic suspension of cells (e.g. maize cells). Another way is
by bombardment of immature embryos (e.g. barley embryos). 

With the present invention, it can be shown by using particle bombardments that the -807 bp
Ltp2 gene promoter fused to a beta-glucuronidase (GUS) reporter gene (which serves as a
GOI) is active in the aleurone layer of developing barley seeds, giving 5% of the activity of
the strong constitutive actin-promoter from rice. Also, in transgenic rice plants, the barley
Ltp2-promoter directs strong expression of the GUS-reporter gene exclusively in the aleurone
layer of developing seeds, suggesting the presence of conserved mechanisms for aleurone cell 
gene expression in the cereals. 
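
A relative activity figure of this kind is typically obtained by normalising each reporter reading to a co-bombarded internal standard and then expressing the test promoter as a percentage of the reference promoter. The sketch below assumes that approach, using the luciferase (LUC) internal standard described in the Methods; the specification does not state the exact calculation, and the function names and any numbers passed to them are hypothetical.

```python
from typing import Tuple


def normalized_activity(gus_signal: float, luc_signal: float) -> float:
    """GUS reading divided by the LUC internal-standard reading."""
    if luc_signal <= 0:
        raise ValueError("internal standard signal must be positive")
    return gus_signal / luc_signal


def percent_of_reference(
    test: Tuple[float, float], reference: Tuple[float, float]
) -> float:
    """Express a test promoter as a percentage of a reference promoter.

    Each argument is a (GUS, LUC) pair measured on the same extract.
    """
    test_norm = normalized_activity(*test)
    ref_norm = normalized_activity(*reference)
    if ref_norm == 0:
        raise ValueError("reference promoter shows no activity")
    return 100.0 * test_norm / ref_norm
```

Normalising to the internal standard before taking the ratio corrects for differences in delivery efficiency between individual bombardments.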

In a preferred embodiment, the Ltp2 gene encodes a 7 kDa barley seed nsLTP and has about
80% identity to the wheat 7 kDa protein.

The transcript of the Ltp2 gene is detectable in the earliest morphologically distinguishable
aleurone cells and accumulates during the differentiation stage to decline finally during seed
maturation. It can also serve as a molecular marker for the differentiating aleurone cells, as
shown in in situ hybridisation experiments where the spatial distribution of the transcript is to
be examined.

In the present invention, a genomic clone was isolated using the cDNA insert of previously 
isolated cDNA clone pBz11E and characterised by DNA sequencing.

The sequence of the cDNA and isolated genomic clone was found to be identical in the 
overlapping region. It was found that the Ltp2 gene does not contain any intron.

To prove that this is an active gene, the 5' region carried on an 845 bp DNA fragment
delineated by two BglII restriction sites was fused to the GUS gene (following Jefferson
1987) and the construct was introduced into barley aleurone layers using microprojectile
bombardment. Aleurone cells expressing GUS activity were detected proving that the gene 
promoter was indeed capable of driving the expression of the GOI in the relevant tissue. 

By examining the DNA sequence of this active promoter, several putative cis-acting
elements with the potential of binding known transcriptional factors present in cereal aleurone
layers were detected. They include the binding sites for transcriptional factors of the myb
and myc class, namely TAACTG and CANNTG respectively. Our experiments showed that
the myb and myc sites were important for good levels of expression.
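
A scan for the two consensus motifs named above can be sketched as follows. This is a minimal illustration, not the procedure used in the present work: the function names and the toy sequence in the usage note are hypothetical, N is treated as any of A, C, G or T, and the reverse strand is covered by searching for the reverse complement of each non-palindromic motif.

```python
import re
from typing import List, Tuple

# Consensus motifs quoted in the text.
MOTIFS = {"myb": "TAACTG", "myc": "CANNTG"}

_IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def _compile(motif: str) -> re.Pattern:
    # A lookahead is used so that overlapping sites are all reported.
    body = "".join(_IUPAC[base] for base in motif)
    return re.compile("(?=(" + body + "))")


def find_sites(sequence: str) -> List[Tuple[str, str, int]]:
    """Return (motif name, strand, 0-based start) for every site found."""
    seq = sequence.upper()
    hits = []
    for name, motif in MOTIFS.items():
        strands = [("+", motif)]
        rc = reverse_complement(motif)
        if rc != motif:  # CANNTG is its own reverse complement
            strands.append(("-", rc))
        for strand, pattern in strands:
            for match in _compile(pattern).finditer(seq):
                hits.append((name, strand, match.start()))
    return sorted(hits, key=lambda hit: hit[2])
```

For the toy input `GGTAACTGCCCACGTGTTCAGTTAGG`, `find_sites` reports a myb site on the forward strand at position 2, a myc site at position 10 and a myb site on the reverse strand at position 18.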

Gel retardation experiments showed that the Ltp2 gene promoter has a myb site that is 
recognised by a MYB protein (e.g. from chicken). 

In the present invention, mature fertile rice plants were regenerated from transformed
cultured rice protoplasts. The developing seeds of these primary transformants were analysed
for the expression of GUS. It was found that the barley seed Ltp2 gene promoter confers
aleurone specific expression in transgenic rice plants. This is the first example of an
aleurone specific promoter in developing seeds of a transgenic cereal.

The following were deposited in accordance with the Budapest Treaty at the recognised
depositary The National Collections of Industrial and Marine Bacteria Limited (NCIMB) at
23 St Machar Drive, Aberdeen, Scotland, UK, AB2 1RY, on 22 November 1993:

(i) An E. coli K12 bacterial stock containing the plasmid pLtp2pr - i.e. Bluescript containing
the Ltp2 gene promoter (Deposit Number NCIMB 40598).

[To form pLtp2pr, the Ltp2 promoter of Figure 2b (see later) contained on a BglII fragment
was inserted in the Bluescript KS vector into the BamHI site.]

(ii) An E. coli K12 bacterial stock containing the plasmid pLtp2/GN - i.e. Bluescript
containing a Ltp2 gene promoter - GUS conjugate (Deposit Number NCIMB 40599).

[To form pLtp2/GN, the GUS-reporter gene cassette (GN) contained on the SmaI-EcoRI
fragment of the commercially available vector pBI101 (Clontech Inc.) was cloned
directionally into the SmaI and EcoRI sites of pLtp2pr.]

(iii) An E. coli K12 bacterial stock containing the plasmid pLtp2ΔBC/GN - i.e. Bluescript
containing an Ltp2 gene promoter with a deletion spanning the myb and myc sites - GUS
conjugate (Deposit Number NCIMB 40601).

[To form pLtp2ΔBC/GN, the Ltp2 promoter and the GN gene were inserted as described for
pLtp2pr and pLtp2/GN except for the use of Bluescript SK and that the Ltp2 promoter was
deleted in the myb-myc region (using a PCR strategy) as explained in the legend of Figure
7 (see later).]

(iv) An E. coli K12 bacterial stock containing the plasmid pLtp2Sh1/GN - i.e. Bluescript
containing an Ltp2 gene promoter-Sh1 intron-GUS conjugate (Deposit No. NCIMB 40600).

[To form pLtp2Sh1/GN, the Ltp2 promoter and the GN gene were inserted as described for
pLtp2pr and pLtp2/GN except for the use of Bluescript SK. The Sh1 intron from maize
contained on a HincII restriction fragment was inserted into the SmaI site of this construct.]

Other embodiments and aspects of the present invention include: A transformed host having
the capability of expressing a GOI in just the aleurone layer; A vector incorporating a
conjugate as hereinbefore described or any part thereof; A plasmid comprising a conjugate
as hereinbefore described or any part thereof; A cellular organism or cell line transformed
with such a vector; A monocotyledonous plant comprising any one of the same; A
developing seed comprising any of the same; and A method of expressing any one of the
same.

The present invention will now be described only by way of examples in which reference 
shall be made to the accompanying Figures in which: 

Figure 1 is a nucleotide sequence of the Ltp2 gene; 

Figure 2a is a nucleotide sequence of the Ltp2 gene promoter; 

Figure 2b is a nucleotide sequence of the Ltp2 gene promoter with an additional
39 nucleotides for fusion to a GUS gene for transgenic rice and transient assay
studies;

Figure 3 shows transverse sections from the mid-region of barley seeds (A-E) and
steady state levels of the Ltp2 mRNA in different tissue fractions of developing 
barley endosperm; 

Figure 4 shows the results for an in situ hybridization experiment; 

Figure 5 is the result of a Southern blot experiment using DNA from transgenic
rice plants; 

Figure 6 shows the expression of a GUS-reporter gene driven by the Ltp2 gene
promoter in the aleurone layer of developing transgenic rice seeds; and 

Figure 7 shows the position of the myb and myc binding sites in the barley Ltp2
gene promoter.

A. METHODS

i. Plant material

Seeds of barley (Hordeum vulgare cv. Bomi) were collected from plants grown in a
phytotron as described before (Kvaale and Olsen, 1986). The plants were emasculated and
pollinated by hand and isolated in order to ensure accurate determination of seed age.

ii. cDNA and genomic clones

The isolation and sequencing of the aleurone specific cDNA clone pBz11E (which is the
same as Ltp2) was conducted as described by Jakobsen et al. (1989).

A barley, cv. Bomi genomic library was constructed by partial MboI digestion of total
genomic DNA and subsequent ligation of the 10-20 kilo basepair (kb) size fraction with
BamHI digested lambda EMBL3 DNA (Clontech Labs, Palo Alto, Ca, USA). Out of a total
of 2 X 10^ plaques screened, using the Bz11E cDNA insert as a template for probe synthesis
with a random labelling kit (Boehringer-Mannheim), four positive clones (gHv29-101,
gHv38-201, gHv53-201 and gHv59-101) were identified after repeated rounds of plaque
hybridization. DNA purified from these clones was restricted with several enzymes and
characterized by Southern blot analysis.

The restriction maps of the four clones showed extensive overlap. One clone, gHv53-201,
containing an insert of around 12 kb, was chosen for further analysis. A 6 kb PstI fragment
contained within the insert that hybridized to the cDNA probe was subcloned into Bluescript
(Stratagene) giving the subclone BL53Psl7. A NheI restriction fragment of 0.7 kb covering
the coding region of the Ltp2 gene was cloned into the XbaI site of M13mp18 and sequenced
using the Sequenase protocol (USB) after isolation of ssDNA template using PCR
amplification and magnetic beads (Dynabeads M280-Streptavidin, Dynal).

In order to characterize the 5' and 3' sequences of the Ltp2 gene, the following DNA
fragments were generated by PCR amplification:

i) a 1.2 kb fragment covering the 5' end from a vector primer (KS) to the PLT11 primer
located within the 5' end of the cDNA; and

ii) a 0.3 kb PCR product generated by amplification directed by the primers LTP13 and
PLT15, of which the latter is based upon sequence information from the cDNA clone Bz14A,
which is overlapping and identical with the Bz11E cDNA but contains some additional 30
base pairs after the polyadenylation site indicated at position 490 in Figure 1.

The sequences are: 

KS: 5' CGAGGTGGAC GGTATCG 3' 

PLT11: 5' TAGGGTGATG TACTGGGCTA 3'

LTP13: 5' ACGAAGCCGA GCGGCGAGT 3'

PLT15: 5' GGAGTAAAAA AAAAGTTGCA AGAGAAATTT G 3'. 

The PLT11 sequence contains one base substitution (shown in bold and underlined) creating
a BglII restriction site.

The 1.2 kb PCR product containing the 5' end was restricted with BglII which gave a 0.84
kb fragment with BamHI compatible sticky ends that was subsequently cloned into the BamHI
site of pBluescript.
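
The compatibility of BglII and BamHI ends follows from their recognition sites: BglII cuts A^GATCT and BamHI cuts G^GATCC, so both leave the same four-base GATC overhang. The sketch below illustrates this; the function names are hypothetical, and the small enzyme table holds only the two enzymes discussed here.

```python
# Recognition site and top-strand cut offset for the two enzymes in the text.
ENZYMES = {
    "BglII": ("AGATCT", 1),  # A^GATCT
    "BamHI": ("GGATCC", 1),  # G^GATCC
}


def overhang(enzyme: str) -> str:
    """Single-stranded 5' overhang left by a symmetric cut in a palindromic site."""
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]


def compatible(enzyme_a: str, enzyme_b: str) -> bool:
    """Ends can anneal and be ligated when the overhangs are identical."""
    return overhang(enzyme_a) == overhang(enzyme_b)


def hybrid_junction(left: str, right: str) -> str:
    """Sequence formed when a `left`-cut end is ligated to a `right`-cut end."""
    left_site, left_cut = ENZYMES[left]
    right_site, right_cut = ENZYMES[right]
    return left_site[:left_cut] + overhang(left) + right_site[len(right_site) - right_cut:]
```

The BglII/BamHI hybrid junction (AGATCC) matches neither recognition site, so an insert cloned this way is not released again by digestion with either enzyme.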

The 0.3 kb PCR product of the 3' end was treated with T4 DNA polymerase (Sambrook et
al., 1989) and subsequently cloned into the HincII site of M13mp18.

The sequences of the PCR products were determined as described above.



WO 95/15389

iii. Northern analyses

Total RNA was extracted from barley seed tissues of 10 DAP and older plant material
essentially as described by Logemann et al. (1987), except that LiCl precipitation was used
in place of ethanol precipitation. The RNA was denatured using formaldehyde and separated
on 1.2% agarose gels as described by Selden (1987) and blotted onto GeneScreen (NEN)
membranes using a Stratagene posiblotter apparatus according to supplier's instructions.

Hybridization was according to GeneScreen instruction manual (NEN) using radioactively 
labelled DNA strands complementary to the pBz11E cDNA insert generated with a random
primed DNA labeling kit (Boehringer Mannheim). 

iv. In situ hybridization 

For in vitro transcription of antisense RNA, the plasmid pBz11E (Jakobsen et al., 1989) was
linearized with ft/I and transcribed with T7 RNA polymerase by using MAXIscript
(Ambion) and [5,6-3H]-Uridine 5'-triphosphate (40-60 Ci mmol-1) (Amersham International)
according to the specifications of the supplier. The probe was hydrolyzed to fragments of
about 100 bp as described by Somssich et al. (1988). Seed tissues were fixed in 1%
glutaraldehyde, 100 mM sodium phosphate (pH 7.0) for 2 hours and embedded in Histoplast
(Histolab, Göteborg, Sweden).

Sections of 10 µm were pretreated with pronase (Calbiochem) as described by Schmelzer
et al. (1988) and hybridized with 25 ml of hybridization mix (200 ng probe ml-1, 50%
formamide, 10% (w/v) dextran sulphate, 0.3 M NaCl, 10 mM Tris-HCl, 1 mM EDTA (pH
7), 0.02% polyvinylpyrrolidone, 0.2% Ficoll, 0.02% bovine serum albumin) for 15 hours
at 50 °C.

Posthybridization was carried out according to Somssich et al. (1988) and autoradiography
was done as described by Schmelzer et al. (1988), except that sections were exposed for 10
weeks.

v. Constructs for transient expression analysis

For the microprojectile bombardment experiments, the following constructs were used: 

CONTROL A: pAct1f-GUS containing the rice Actin 1 promoter fused to the uidA
reporter gene encoding the GUS enzyme (McElroy et al., 1990);

CONTROL B: pRT101-ex/s-int/s-LUC containing the 35S CaMV promoter-Sh1
first exon/intron fused to the firefly luciferase gene (Maas et al., 1991);

CONTROL C: pRT101C1 containing the C1 cDNA downstream of the 35S CaMV
promoter (Paz-Ares et al., 1987); and

CONTROL D: pMF6Lc(R) containing the Lc cDNA corresponding to one R gene
allele coupled to the 35S CaMV promoter-Adh1 intron (Ludwig et al., 1989).

For the transient expression studies in barley aleurone the first intron of the maize Sh1 gene
carried on a 1.1 kb HincII fragment (Maas et al., 1991) was inserted into the SmaI site of
the promoter-reporter gene constructs according to the present invention. The Ltp2 gene
promoter is contained on the 0.84 kb BglII fragment (sequence is presented in Figure 2) and
was inserted into the BamHI site of pBluescript. Thereafter the structural uidA gene
encoding the beta-glucuronidase (GUS) enzyme was fused to the Ltp2 gene promoter.

The following conjugates according to the present invention were studied: 

(i) Ltp2/GN: A Ltp2 gene promoter - GUS conjugate (same as conjugate in
pLtp2/GN - see earlier);

(ii) Ltp2Sh1/GN: A Ltp2 gene promoter - Sh1 intron - GUS conjugate (same as
conjugate in pLtp2Sh1/GN - see earlier).

Isolated plasmid DNA was used in the bombardment studies. The same conditions were used
for the control conjugates and for the conjugates of the present invention.

For transient assay studies with rice protoplasts, the following conjugates according to the
present invention were studied:

(i) Ltp2/GN: As above; and

(iii) Ltp2ΔBC/GN: A Ltp2 gene promoter (with a deletion spanning the myb and myc
sites) - GUS conjugate (same as conjugate in pLtp2ΔBC/GN - see earlier).

vi. Transformation of barley aleurone layers by particle bombardment

Barley seeds were harvested at 25 DAP, surface sterilized in 1% sodium hypochlorite for 5
min and then washed 4 times in sterile distilled water. The maternal tissues were removed
to expose the aleurone layer and the seed was then divided into two, longitudinally along the
crease. The pieces of tissue were then placed, endosperm down, onto MS (Murashige &
Skoog 1962) media with 10 g/l sucrose solidified with 10 g/l agar in plastic petri dishes (in
two rows of 4 endosperm halves per dish).

Single bombardments were performed in a DuPont PDS 1000 device, with M-17 tungsten 
pellets (approx. 1 μm in diameter) coated with DNA as described by Gordon-Kamm et al. 
(1990) and using a 100 mm mesh 2 cm below the stopping plate. Equal amounts (25 μg per 
preparation) of the GUS (promoter-reporter gene) and LUC (internal standard) plasmids were 
mixed before adding the microprojectiles. One tenth of this amount, 2.5 μg, was used for 
the Lc and C1 cDNA constructs. Bombarded tissue was incubated at 24°C for 3-4 days 
before extraction and measurement of GUS and LUC activities. Anthocyanin pigmentation 
could be observed in the bombarded aleurones directly without further treatment. 
Histochemical staining for GUS expression was performed with X-Gluc (5-bromo-4-chloro-3-indolyl-β-D-glucuronic 
acid) as described by Jefferson (1987) at 37°C for 2 days. Extraction 
of cellular proteins for quantitative analysis was performed by grinding 4-8 half seeds in a 
mortar and pestle with 0.5 ml of extraction buffer (50 mM Na-phosphate, 1 mM DTT, 
pH 7.0). 




After grinding, a further 0.5 ml was added and two 400 μl aliquots were taken. To one of 
these, 100 μl of 5x Luciferase cell lysis buffer (Promega) was added and the sample 
vortexed before clearing by centrifugation at 10,000 rpm. A 20 μl aliquot was then measured 
for LUC activity in a scintillation counter (Tri-Carb 4000), using the luciferase assay system 
of Promega (E1500). To the other 400 μl aliquot, 100 μl of 5x lysis buffer (500 mM Na 
phosphate pH 7.0, 50 mM EDTA, 10 mM DTT, 0.5% Sarcosyl, 0.5% Triton X-100) was 
added, the mixture vortexed and cleared as above and assayed for GUS activity using 
4-methylumbelliferyl β-D-glucuronide as described by Jefferson (1987), modified to include 
5% methanol in the reaction mixture (Kosugi et al., 1990). 
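
The composition of the GUS assay sample after the 5x lysis buffer has been added can be worked out from the volumes given above. The following Python sketch is illustrative only: it assumes that volumes are additive, that the cleared extract still has the composition of the extraction buffer, and that the tissue itself contributes nothing; the names `mix` and `final_mixture` are hypothetical.

```python
# 5x GUS lysis buffer components, as listed in the text.
LYSIS_BUFFER_5X = {
    "Na phosphate (mM)": 500.0,
    "EDTA (mM)": 50.0,
    "DTT (mM)": 10.0,
    "Sarcosyl (%)": 0.5,
    "Triton X-100 (%)": 0.5,
}

# Extraction buffer components, as listed in the text.
# Assumption: the cleared extract still has this composition.
EXTRACTION_BUFFER = {
    "Na phosphate (mM)": 50.0,
    "DTT (mM)": 1.0,
}


def mix(conc_a, vol_a_ul, conc_b, vol_b_ul):
    """Concentration after combining two solutions (volumes assumed additive)."""
    return (conc_a * vol_a_ul + conc_b * vol_b_ul) / (vol_a_ul + vol_b_ul)


def final_mixture(stock_ul=100.0, extract_ul=400.0):
    """Final concentration of each lysis buffer component in the assay sample."""
    return {
        name: mix(conc, stock_ul, EXTRACTION_BUFFER.get(name, 0.0), extract_ul)
        for name, conc in LYSIS_BUFFER_5X.items()
    }


for name, conc in final_mixture().items():
    print(f"{name}: {conc:g}")
```

Mixing 100 μl of stock with 400 μl of extract dilutes every 5x component five-fold; the phosphate and DTT already present in the extract add to that contribution.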

Production of 4-methylumbelliferone (MU) was measured after 1 and 4 h using a TKO 1000 
Mini-Fluorimeter (Hoefer Scientific Instruments). In the analysis of promoter activities, the 
GUS readings (MU produced per hr) were standardized by division with the LUC value 
(photons produced over 30 s, beginning 60 s after mixing) from the same extract. 
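
The standardization described above can be sketched in a few lines of Python. This is a minimal illustration and not the procedure actually used: it assumes that the hourly MU rate is taken as the difference between the 4 h and 1 h readings divided by the elapsed time, and all function names and readings are hypothetical.

```python
def mu_per_hour(mu_at_1h, mu_at_4h):
    """Rate of MU production between the 1 h and 4 h fluorimeter readings."""
    return (mu_at_4h - mu_at_1h) / (4.0 - 1.0)


def standardized_gus(mu_at_1h, mu_at_4h, luc_photons):
    """GUS rate divided by the LUC reading from the same extract."""
    if luc_photons <= 0:
        raise ValueError("LUC reading must be positive to standardize against it")
    return mu_per_hour(mu_at_1h, mu_at_4h) / luc_photons


# Hypothetical readings for one extract (not data from the document).
print(standardized_gus(mu_at_1h=10.0, mu_at_4h=40.0, luc_photons=1000.0))
```

Because both readings come from the same extract, the ratio is insensitive to how much tissue was hit in a given shot, which is the purpose of the internal LUC standard.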

vii. Transient assay of rice protoplasts 

In this experiment, the same type of protoplasts as used for stable transformation of rice 
plants was transiently transformed with constructs (i) and (iii) (see above) and then assayed 
for GUS activity. 

viii. Rice transformation 

Southern blot analysis of transgenic rice plants 

Total genomic DNA was isolated from mature leaves, digested with XbaI and then 
transferred to a nylon membrane (Amersham). The coding region of the GUS gene was 
labelled and amplified with digoxigenin 11-dUTP by polymerase chain reaction and used for 
probing the Ltp2 - GUS gene. Hybridization and chemiluminescence signal detection were 
performed according to the manufacturer's specifications (Boehringer Mannheim). 

RESULTS WITH REFERENCE TO THE FIGURES 

i. Figure 1 is a nucleotide sequence of the Ltp2 gene. A transcription start site has 
been assigned as +1. The TATA consensus sequence is boxed. Consensus myb and myc 
binding sites and the SphI element (Hattori et al., 1992) found in the C1 promoter sequence 
are shown in bold italics. 

In the ORF (open reading frame), the nucleotides are shown in bold letters, starting with the 
first ATG codon and ending with the TAG stop codon. The single base substitution 
introduced at position +41 (A>T) creates a BglII restriction site which delimits the 3' end 
of the fragment covering the Ltp2 gene promoter. The positions of the 5' end and 
polyadenylation site of the corresponding cDNA, Bz11E (Jakobsen et al., 1989), are shown 
by arrows. Two putative polyadenylation signals are underlined. 

ii. Figure 2a is a nucleotide sequence for the Ltp2 gene promoter. Figure 2b is a 
nucleotide sequence for the Ltp2 gene promoter with an additional number of nucleotides for 
fusion to a GUS gene. 

iii. Figure 3 shows transverse sections from the mid-region of barley seeds (A-E) and 
steady state levels of the Ltp2 mRNA in different tissue fractions of developing barley 
endosperm (F). 

Figure 3 can be analysed as follows: 

(A) : Ten DAP (days after pollination) endosperm isolated from the surrounding 
maternal tissues by manual extrusion. For maternal tissues, see (C). The extruded 
endosperm consists of the central starchy endosperm cells, a group of modified 
aleurone cells over the crease area (arrow) and one layer of highly vacuolated 
peripheral aleurone cells (arrowhead). 

(B) : Enlargement showing vacuolated peripheral aleurone cells (AC) and starchy 
endosperm cells (SE) in area of (A) marked with arrowhead. 

(C) : Pericarp of 10 DAP seed after extrusion of the endosperm with the nucellus 
epidermal layer (NE) facing the endosperm cavity, which contained the endosperm 
in (A) before extrusion. 

(D) : Extruded 15 DAP endosperm with central starchy endosperm cells and 
modified aleurone cells (arrow), but without peripheral aleurone cells. 

(E) : 15 DAP pericarp with adhering aleurone layer after extrusion of the 
endosperm (in D). 

(F) : Northern blot showing the steady state level of Ltp2 mRNA in the extruded 
endosperm fraction (e) and the pericarp fraction (p) in the interval from 10 to 13 
DAP. For this blot, 10 μg of total RNA was loaded in each lane. The gel was 
blotted and hybridized with randomly primed Ltp2 cDNA. 

iv. Figure 4 shows the results for an in situ hybridization of 3H-labelled Ltp2 antisense 
probe to transverse sections of barley endosperm (A and B) and transient gene expression 
analysis of different promoter-reporter gene constructs in developing barley aleurone layers 
after particle bombardment (C, D and E). Figure 4 can be analysed as follows: 

(A) : Dark field microphotograph of 13 DAP endosperm showing hybridization of 
the Ltp2 probe to the peripheral aleurone cells (AL) ventrally and laterally, but not 
to aleurone cells on the dorsal side of the grain (DS), nor to the modified aleurone 
cells over the crease area (MA). 

(B) : Magnification of peripheral endosperm (frame in A) showing gradient of in 
situ hybridization signal towards the dorsal side of the seed containing 
undifferentiated aleurone cells. 

(C) : Colourless barley aleurone layer co-bombarded with the 35S-C1 and 35S-Lc 
cDNA constructs. Single aleurone cells synthesizing anthocyanin pigment appear 
as red spots. 

(D) : Exposed aleurone layer of 25 DAP barley seeds bombarded with the Ltp2/Sh1 
int/GUS construct. The transfected seed was stained for detection of GUS activity. 

(E) : Exposed aleurone layer of barley seed of the same stage bombarded with 
pAct1f-GUS construct and histochemically stained as in (D). 

(V): Ventral crease area. 

v. Figure 5 is the result of a Southern blot experiment of DNA from transgenic rice 
plants harbouring the Ltp2-GUS construct. Figure 5 can be analysed as follows: 

Lane P: plasmid Ltp2-GUS. 

Lane C: untransformed control plants. 

Lane 1: transgenic line 3-15. 

Lane 2: transgenic line 4-13. 

Lane 3: transgenic line 2-6. 

Lane 4: transgenic line 4-6. 

vi. Figure 6 shows the expression of a GUS-reporter gene driven by the Ltp2 wild-type 
promoter in the aleurone layer of developing transgenic rice seeds. Figure 6 can be analysed 
as follows: 

(A) : Longitudinal section of 20 DAP seed showing GUS staining exclusively in 
the aleurone layer (AL), but not in the embryo, starchy endosperm (SE) or in the 
maternal tissue (M). 

(B) : Transverse section from the mid-region of 20 DAP seed. 

(C) : Enlargement of dorsal side of seed shown in (A). 

(D) : Non-transgenic control seed, same age as in (A). 

(E) : A 5 day-old seedling transformed with the Ltp2 - GUS gene. 

(F) : A 5 day-old seedling transformed with the CaMV35S - GUS gene (Terada 
and Shimamoto, 1990). Arrows indicate regions of GUS expression. Bars in (A, B 
and D) are 10 mm and in (C) 2.5 mm. 

vii. Figure 7 shows the position of the myb and myc sites in the barley Ltp2 gene 
promoter. The distance from the 3' end of the myc site to the TATA box is given in 
nucleotides. The following nucleotides from and between the myb and myc sites were 
deleted to form the conjugate containing the deletion in the Ltp2 gene promoter: 

CAACTACCATCGGCGAACGACCCAGC. 
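
The effect of this deletion on the promoter sequence can be illustrated with a short Python sketch. It is not the cloning procedure used for the deletion conjugate; it only removes the quoted 26-nucleotide stretch from a string. The 40-nucleotide `WINDOW` is our reading of the retyped sequence listing around position -180 (Figure 2A remains the authoritative sequence), and the function name is hypothetical.

```python
# The 26 nucleotides quoted in the text as deleted (myb site to myc site).
DELETED_REGION = "CAACTACCATCGGCGAACGACCCAGC"

# Assumption: 40 nt read from the retyped sequence listing around position -180.
WINDOW = "CGGGCGGCAACTACCATCGGCGAACGACCCAGCTGACCTC"


def delete_region(promoter, region=DELETED_REGION):
    """Return the promoter with the single occurrence of region removed."""
    if promoter.count(region) != 1:
        raise ValueError("region must occur exactly once in the promoter")
    start = promoter.index(region)
    return promoter[:start] + promoter[start + len(region):]


shortened = delete_region(WINDOW)
print(len(DELETED_REGION), len(WINDOW), len(shortened))
print(shortened)
```

Requiring exactly one occurrence guards against silently deleting the wrong copy if the motif were repeated elsewhere in a longer sequence.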
CONCLUSIONS 

1. The barley Ltp2 gene encodes a protein homologous to the 7 kDa wheat lipid 
transfer protein 

Using the Bz11E cDNA (Jakobsen et al., 1989) as a probe, the corresponding barley cv. 
Bomi genomic clone was isolated. The sequences of the genomic clone and that of the Bz11E 
cDNA are identical in overlapping regions and no intervening sequences were detected 
(Figure 1); accordingly this gene is Ltp2. The ATG codon initiating the longest open reading 
frame (ORF) in the Ltp2 sequence is located 64 bp downstream of the putative transcriptional 
start site at nucleotide number 1 (Figure 1). The ORF contains eight potential translation start 
codons between nucleotides 64 and 127. Two polyadenylation signals, which conform to the 
plant consensus sequence (Joshi, 1987), are found in the 3' end of the genomic sequence. In 
the Bz11E cDNA the polyA tail extends after the G at position 491 (Figure 1 and Figure 2). 

2. The Ltp2 transcript can be a molecular marker for peripheral aleurone cell 
differentiation 

In the developing seed at approximately 8 days after pollination (DAP), aleurone cell 
differentiation is initiated over the ventral crease area, resulting in the formation of the 
modified aleurone cells (Figure 3A and Bosnes et al., 1992). Shortly after, at 9 DAP, the 
second type of aleurone cells, characterized by their extensive vacuolation (Figure 3B), 
appears in the peripheral endosperm close to the crease area, spreading first laterally and 
then to the dorsal side of the seed (see Figure 3A). At this stage, when whole 
de-embryonated seeds are squeezed, the extruded endosperm consists of the starchy endosperm, 
the peripheral and the modified aleurone cells (Figure 3 A-C). This is in contrast to later 
developmental stages, where the extruded endosperm consists only of the starchy endosperm 
and the modified aleurone cells (Figure 3D). The reason for this is that the cells of the 
aleurone layer adhere to the maternal pericarp (Figure 3E). Aleurone cell formation is 
completed at 21 DAP, when cell division stops (Kvaale and Olsen, 1986). Using the Ltp2 
probe on Northern blots with total RNA, the signal obtained gradually becomes stronger in 
the pericarp, compared to the extruded endosperm, confirming the relocation of the aleurone 
cells from the endosperm fraction to the pericarp fraction in the interval between 10 and 13 
DAP (Figure 3F). 

From the experimental results presented in Figure 3 it is concluded that the Ltp2 transcript 
is a potential marker for aleurone cell differentiation. To corroborate the usefulness of the 
Ltp2 transcript as a molecular marker for aleurone cell differentiation, in situ analysis was 
carried out on transverse sections of 13 DAP seeds. The rationale for using seeds from this 
stage was the earlier observed gradual differentiation of the peripheral aleurone cells, starting 
near the crease area and spreading to the dorsal side (Bosnes et al., 1992). 

Using 3H-labelled antisense transcript as probe, a positive signal is clearly visible in the 
peripheral aleurone cells in the ventral part adjacent to the crease area as well as laterally up 
towards the dorsal side of the grain (Figure 4A). However, no signal is present in the dorsal 
region of the seed, nor over the modified aleurone cells. 

Focusing on the most dorsal aleurone cells showing a positive signal in the in situ analysis 
(Figure 4B), the morphology of these cells corresponds to that of the highly vacuolated 
peripheral aleurone cells in 10 DAP endosperm (Figure 3B). 

The Ltp2 transcript therefore represents a highly tissue specific molecular marker for 
aleurone cell differentiation. 

3. The Ltp2 gene promoter is transiently expressed in developing barley aleurone cells 
after particle bombardment 

The Ltp2 gene promoter contained on an 842 bp BglII restriction fragment (from nt -807 to 
nt +35 in Fig. 1) was fused to the GUS-reporter gene and introduced into the exposed 
aleurone layers of 25 DAP whole barley seeds by the biolistic method. In the first set of 
experiments, Ltp2 gene promoter activity was assayed visually after histochemical staining 
with X-Gluc. Due to the large variation between individual experiments with the biolistic 
method, plasmid DNA containing the Lc and C1 cDNAs from maize under the control of the 
35S CaMV promoter was co-bombarded with the Ltp2 construct to monitor shooting efficacy. 
In combination, but not individually, the latter two cDNAs give high numbers of red 
anthocyanin spots in the barley aleurone cells without any treatment after 1 to 2 days of 
incubation of the seeds on solid nutrient medium (Figure 4C). Compared to the number of 
red spots, the Ltp2-GUS construct consistently gave very few spots after histochemical 
staining in co-bombardment experiments. 
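
The 842 bp length quoted for the fragment follows from its end points if, as is usual for coordinates relative to a transcription start site at +1, there is no position 0. The Python sketch below states that assumption explicitly; the function name is hypothetical.

```python
def fragment_length(start, end):
    """Number of nucleotides from start to end inclusive, with no position 0.

    Coordinates are relative to a transcription start site numbered +1, so
    position -1 is immediately followed by position +1.
    """
    if start == 0 or end == 0:
        raise ValueError("position 0 does not exist in this numbering")
    if start > end:
        raise ValueError("start must not lie downstream of end")
    length = end - start + 1
    if start < 0 < end:
        length -= 1  # skip the non-existent position 0
    return length


print(fragment_length(-807, 35))   # whole BglII fragment
print(fragment_length(-807, -1))   # part upstream of the start site
```

Under this convention the fragment comprises 807 nucleotides upstream of the start site and 35 nucleotides of transcribed sequence.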

Based on previous reports that insertion of introns in promoter constructs enhances the level 
of transient expression (Maas et al., 1991) without interfering with the tissue specificity of 
the promoters, the intron from the maize Shrunken-1 gene was inserted into the Ltp2-GUS 
construct after the promoter. Using this construct the number of spots in immature aleurone 
layers increased (Figure 4D). Still, however, compared to aleurone layers bombarded with 
the pAct1f-GUS construct (McElroy et al., 1990), which contains the promoter of the 
constitutively expressed Actin1 gene from rice (Figure 4E), both the number and the size of 
the spots obtained with the Ltp2-GUS construct are dramatically smaller (Figure 4D). 

In order to quantify Ltp2 gene promoter activity in particle bombardment experiments, 
another control plasmid containing the LUC gene under the control of the 35S-promoter was 
co-bombarded with the Ltp2-GUS constructs. In this way, after particle bombardment and 
incubation on tissue culture medium, protein was extracted from the seeds in a buffer that 
allowed measurement of both LUC and GUS activity (for details, see Materials and Methods 
section). In such experiments, calculating GUS expression standardized on the basis of the 
LUC activity, the Ltp2-GUS activity was not significantly higher than background, 
corresponding approximately to 1.5% of the Actin1f promoter activity in parallel 
experiments. 



For the Ltp2-Sh1 intron-GUS construct, however, the activity was significantly higher than 
background, corresponding to 5% of that of the Actin1 promoter. Blue spots from the activity 
of the Ltp2 promoter were never observed in tissues other than the aleurone layer of 
developing seeds. From these experiments it is concluded that the -807 bp promoter of the 
Ltp2 gene is capable of directing transient gene expression in a fashion similar to that of the 
endogenous Ltp2 gene, i.e., in the cells of the aleurone layer of immature barley seeds. 
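
The two relative activities reported above (approximately 1.5% of the Actin1 promoter activity without the intron and 5% with it) imply a fold increase attributable to the Sh1 intron. The following Python sketch shows only that arithmetic; the function name is hypothetical and the inputs are the percentages quoted in the text.

```python
def fold_increase(percent_without_intron, percent_with_intron):
    """Ratio of two activities expressed as percentages of the same reference."""
    if percent_without_intron <= 0:
        raise ValueError("baseline percentage must be positive")
    return percent_with_intron / percent_without_intron


# Percentages of Actin1 promoter activity quoted in the text.
print(round(fold_increase(1.5, 5.0), 1))
```

With the quoted figures the ratio is a little above three. Since the 1.5% value was not significantly above background, the ratio should be read as a rough estimate only.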

4. The Ltp2 gene promoter directs aleurone specific expression of the GUS-reporter 
gene in transgenic rice seeds 

The gene was transformed into rice by electroporation of embryogenic protoplasts following 
the teachings of Shimamoto et al. (1989). 

Four fertile transgenic rice plants were obtained and integration of the Ltp2-GUS gene was 
examined by Southern blot analysis. The results demonstrated that a 2.9 kb fragment 
containing the Ltp2-GUS gene is integrated in all the transgenic lines. Histochemical GUS 
analysis was carried out with developing rice seeds of 20 DAP and 5 day old seedlings 
derived from transgenic seeds (Figure 6). In developing seeds the GUS expression is strictly 
limited to the aleurone layer, with no staining observed in the maternal tissues, starchy 
endosperm or in the embryo of the transgenic seeds (Figure 6 A-C), nor in untransformed 
control seeds (Figure 6 D). No GUS staining was observed in leaves or roots of seedlings 
transformed with the Ltp2 - GUS gene (Figure 6 E). 
In contrast, in seedlings transformed with the CaMV35S - GUS gene, GUS expression is 
detected in the coleoptile, shoots and roots (see Figure 6 F; Terada and Shimamoto, 1990). 

These results clearly demonstrate the aleurone-specific expression of the Ltp2 - GUS gene 
in transgenic rice plants. 

5. The Ltp2 gene promoter contains sequence elements implicated in the 

transcriptional control of cereal endosperm specific genes 

The studies with the deletion spanning the myb and myc sites in the Ltp2 gene promoter 
showed that levels of expression were about 10% of that of the wild-type gene promoter. 
These studies indicated that both the myb and myc sites are important for expression. 

In addition, the Ltp2 gene promoter may even contain another sequence element that has 
been implicated in regulation of gene expression in maize aleurone cells, namely the 
octanucleotide CATGCATG (Figure 1). This sequence, previously referred to as the SphI 
element, has been shown to mediate the transcriptional activation of maize C1 by interaction 
with VP1 (Hattori et al., 1992). As in the maize C1 promoter (Paz-Ares et al., 1987), the 
putative SphI element of the barley Ltp2 gene promoter is located between the TATA-box 
and the myb binding site. 

In addition, the Ltp2 gene promoter may contain two further sites that could play an 
important role in transcription. The first site is an "AL" site and has the sequence 

CATGGAAA 

This AL sequence ends at position -366 in the sequence shown in Figure 1. 

The second site is a "DS" site that has a high degree of similarity or identity with the binding 
site for 5' transcriptional factors from other eucaryotic organisms. This DS site, which has 
dyad symmetry, has the sequence 

TCGTCACCGACGA 

This DS sequence ends at position -121 in the sequence shown in Figure 1. 
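
The dyad symmetry of the DS site can be checked directly from its sequence: the outer arms of a dyad-symmetric site are reverse complements of each other. The Python sketch below is an illustration under that definition; the function names are hypothetical and only the two motif sequences come from the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def dyad_arm_length(site):
    """Length of the longest outer arms that are reverse complements of each other."""
    best = 0
    for arm in range(1, len(site) // 2 + 1):
        if site[:arm] == reverse_complement(site[-arm:]):
            best = arm
    return best


DS_SITE = "TCGTCACCGACGA"   # DS site quoted in the text
SPH_ELEMENT = "CATGCATG"    # octanucleotide quoted in the text

arm = dyad_arm_length(DS_SITE)
print(DS_SITE[:arm], DS_SITE[arm:len(DS_SITE) - arm], DS_SITE[-arm:])
print(reverse_complement(SPH_ELEMENT) == SPH_ELEMENT)
```

For the DS site the check finds two arms, TCGTC and GACGA, separated by a short spacer, whereas the octanucleotide CATGCATG is a perfect palindrome with no spacer.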
DISCUSSION 

The above examples of the present invention concern the barley gene Ltp2, which encodes 
an aleurone specific 7 kDa nsLTP. 

The identification of the Ltp2 protein as a nsLTP is based on the high identity (78%) between 
the predicted Ltp2 amino acid sequence and the 7 kDa protein isolated from wheat seeds 
using the in vitro lipid transfer assay (Monnet, 1990). The high degree of sequence identity 
between the two barley aleurone Ltp gene products and the homologous proteins and 
transcripts from wheat seeds strongly suggests that the aleurone layers of these two cereals 
contain two related classes of nsLTPs with molecular masses of 10 and 7 kDa, respectively. 

While the sequence identity is more than 70% within the two classes, it is only around 20% 
between them. However, several conserved features are apparent in the cereal seed nsLTPs, 
including similar N-terminal signal peptides, an internal stretch of 20 amino acids with 60% 
similarity, and 8 cysteine residues that are believed to be important for the activity of plant 
LTPs (Tchang et al., 1988). Studies also showed that the Ltp2 gene lacks an intron. 
Hybridization experiments using Ltp2 probes to barley genomic Southern blots indicate that 
the barley haploid genome contains only one copy of each gene (Jakobsen et al., 1989; 
Skriver et al., 1992). 

According to a suggestion by Sterk et al. (1991), plant nsLTPs may be involved in the 
extracellular transport of cutin or other lipid monomers from the endoplasmic reticulum to 
the site of synthesis of extracellular matrix components, such as the cuticle. Therefore, one 
possible role for the aleurone specific nsLTPs in barley and wheat could be in the formation 
of the earlier described amorphous layer on the outside of the aleurone cells in wheat seeds 
(Evers and Reed, 1988). The function of this layer is unknown, but it may be involved in 
the regulation of the osmotic pressure in the endosperm during seed development and 
germination. If this holds true, the absence of the Ltp2 transcript in the modified aleurone 
cells in the ventral crease area is functionally significant, since an impermeable layer on the 
outside of these cells would prevent the influx of soluble synthates from the vegetative plant 
parts. 

Of the nine cDNAs isolated in the differential screening experiment designed to identify clones 
representing transcripts differentially expressed in the aleurone layer of developing barley 
seeds, Ltp2 hybridizes to transcripts present exclusively in the aleurone layer. Thus, the 
Ltp2 gene represents a suitable gene for the search for promoter sequences responsible for 
the control of gene transcription in the aleurone layer. 

Due to the lack of a routine protocol for stable barley transformation, demonstration of Ltp- 
promoter specificity in barley has to rely on transient assays using the particle bombardment 
method. Using this method, it was demonstrated that the -807 bp Ltp2 gene promoter 
carried on the Bglll restriction fragment is capable of driving the expression of the GUS 
reporter gene in immature barley aleurone layers. From this it is concluded that the promoter 
fragment carries sequences that are responsible for barley aleurone specific gene 
transcription. 

The Ltp2 gene promoter can be weaker than constitutive cereal promoters like that of the 
Actin1f gene - even after the introduction of the Sh1-intron (see Maas et al. (1991) and their 
work on tobacco protoplasts) into the Ltp2-GUS construct, which increases the expression 
levels by around three-fold. However, this lower expression does not result in any damage 
to the developing seedling - unlike the constitutive cereal promoters. Moreover, and again 
unlike the constitutive cereal promoters, the Ltp2 gene promoter directs desirable tissue and 
stage specific expression. 

As demonstrated by the histochemical assays shown in Figure 6, the Ltp2 BglII promoter 
fragment shows the same aleurone specific expression in developing rice seeds as in barley. 

Thus, the conclusion from the transient assays in barley that this promoter fragment contains 
sequences responsible for aleurone specific gene transcription is confirmed. Furthermore, the 
data from rice provide support to the view that the molecular mechanisms underlying 
aleurone specific gene transcription in developing seeds are conserved among the cereal 
species. 

Ex. SUMMATION 

The Examples describe the isolation of the promoter for the barley gene Ltp2, which encodes 
a novel class of cereal 7 kDa nsLTPs. The gene was isolated by the use of a cDNA from 
a differential screening experiment in which the positive probe was constructed from aleurone 
cell poly (A) rich RNA, and the negative probe from the starchy endosperm of immature 
seeds. 

In situ hybridization analysis demonstrates that the Ltp2 transcript is expressed exclusively 
in the aleurone layer from the beginning of the differentiation stage and half way into the 
maturation stage. Similar to previously identified 10 kDa plant nsLTPs, the Ltp2 protein 
contains the eight conserved cysteine residues. 

The results indicate that the Ltp2 protein is involved in the synthesis of a lipid layer covering 
the outside of the cereal aleurone cells. 

Using particle bombardments it was shown that the -807 bp Ltp2 gene promoter fused to the 
GUS-reporter gene is active in the aleurone layer of developing barley seeds, giving 5% of 
the activity of the strong constitutive actin1f-promoter from rice. Transformed into rice, the 
barley Ltp2-promoter directs strong expression of the GUS-reporter gene exclusively in the 
aleurone layer of developing rice seeds. Analysis of the Ltp2 gene promoter reveals the 
presence of sequence motifs implicated in endosperm specific gene expression in maize, i.e. 
the myb and myc protein binding sites. In short, the Ltp2 gene promoter represents a 
valuable tool for the expression of GOIs in the aleurone layers of cereal seeds. 

Other modifications of the present invention will be apparent to those skilled in the art 
without departing from the scope of the invention. 

SEQUENCE LISTING 



(1) GENERAL INFORMATION 



NAME OF APPLICANTS:    O.-A. OLSEN AND R. KALLA 
BUSINESS ADDRESS:      PLANT MOLECULAR BIOLOGY LABORATORY 
                       DEPARTMENT OF BIOTECHNICAL SCIENCES 
                       AGRICULTURAL UNIVERSITY OF NORWAY AND 
                       AGRICULTURAL BIOTECHNOLOGY PROGRAM NRC 
                       NORWAY N-1432 

TITLE OF INVENTION:    PROMOTER 

(2) INFORMATION FOR SEQUENCE I.D. 1 



SEQUENCE TYPE:         NUCLEIC ACID 
MOLECULE TYPE:         DNA (GENOMIC) 
ORIGINAL SOURCE:       BARLEY 
SEQUENCE LENGTH:       -807 
STRANDEDNESS:          DOUBLE 
TOPOLOGY:              LINEAR 
SEQUENCE: 

-807                GATCTCG ATGTGTAGTC TACGAGAAGG 
-780  GTTAACCGTC TCTTCGTGAG AATAACCGTG GCCTAAAAAT 
      AAGCCGATGA GGATAAATAA AATGTGGTGG TACAGTACTT 
      CAAGAGGTTT ACTCATCAAG AGGATGCTTT TCCGATGAGC 
-660  TCTAGTAGTA CATCGGACCT CACATACCTC CATTGTGGTG 
      AAATATTTTG TGCTCATTTA GTGATGGGTA AATTTTGTTT 
      ATGTCACTCT AGGmTGAC ATTTCAGTTT TGCCACTCTT 
-540  AGGTTTTGAC AAATAATTTC CATTCCGCGG CAAAAGCAAA 
      ACAATTTTAT TTTACmTA CCACTCTTAG CnrCACAAT 
      GTATCACAAA TGCCACTCTA GAAATTCTGT TTATGCCACA 
-420  GAATGTGAAA AAAAACACTC ACTTATTTGA AGCCAAGGTG 
      TTCATGGCAT GGAAATGTGA CATAAAGTAA CGTTCGTGTA 
      TAAGAAAAAA TTGTACTCCT CGTAACAAGA GACX3GAAACA 
-300  TCATGAGACA ATCGCGTTTG GAAGGCTTTG CATCACCTTT 
      GGATGATGCG CATGAATGGA GTCGTCTGCT TGCTAGCCTT 
      CGCCTACCGC CCACTGAGTC CGGGCGGCAA CTACCATCGG 
-180  CGAACGACCC AGCTGACCTC TACCGACCGG AdTGAATGC 
      GCTACCTTCG TCAGCGACGA TGGCCGCGTA CGCTGGCGAC 
      GTGCCCCCGC ATGCATGGCG GCACATGGCG AGCTCAGACC 
-60   GTGCGTGGCT GGCTACAAAT ACGTACCCCG TGAGTGCCCT 
      AGCTAGAAAC TTACACCTGC 



NOTE: ABOVE SEQUENCE IS A RETYPED VERSION OF FIGURE 2A WHICH IS TO 
BE TAKEN AS THE CORRECT SEQUENCE 

INDICATIONS RELATING TO A DEPOSITED MICROORGANISM 

(PCT Rule 13bis) 

A. The indications made below relate to the microorganism referred to in the description 
on page 11, line 6 

B. IDENTIFICATION OF DEPOSIT 
Further deposits are identified on an additional sheet [ ] 

Name of depositary institution 
The National Collections of Industrial and Marine Bacteria Limited (NCIMB) 

Address of depositary institution (including postal code and country) 
23 St. Machar Drive 
Aberdeen 
Scotland 
AB2 1RY 

Date of deposit: 22 November 1993 
Accession Number: NCIMB 40598 

C. ADDITIONAL INDICATIONS (leave blank if not applicable) 
This information is continued on an additional sheet [ ] 

In respect of those designations in which a European patent is sought, and any 
other designated state having equivalent legislation, a sample of the deposited 
microorganism will be made available until the publication of the mention of the 
grant of the European patent or until the date on which the application has been 
refused or withdrawn or is deemed to be withdrawn, only by the issue of such a 
sample to an expert nominated by the person requesting the sample. (Rule 28(4) 
EPC). 

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States) 

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable) 

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications e.g., 'Accession 
Number of Deposit') 

For receiving Office use only 

[ ] This sheet was received with the international application 

Authorized officer 

Form PCT/RO/134 (July 1992) 

For International Bureau use only 

[ ] This sheet was received by the International Bureau on: 

Authorized officer 

INDICATIONS RELATING TO A DEPOSITED MICROORGANISM 

(PCT Rule 13bis) 

A. The indications made below relate to the microorganism referred to in the description 
on page 11, line 13 

B. IDENTIFICATION OF DEPOSIT 
Further deposits are identified on an additional sheet [ ] 

Name of depositary institution 
The National Collections of Industrial and Marine Bacteria Limited (NCIMB) 

Address of depositary institution (including postal code and country) 
23 St. Machar Drive 
Aberdeen 
Scotland 
AB2 1RY 
United Kingdom 

Date of deposit: 22 November 1993 
Accession Number: NCIMB 40599 

C. ADDITIONAL INDICATIONS (leave blank if not applicable) 
This information is continued on an additional sheet [ ] 

In respect of those designations in which a European patent is sought, and any 
other designated state having equivalent legislation, a sample of the deposited 
microorganism will be made available until the publication of the mention of the 
grant of the European patent or until the date on which the application has been 
refused or withdrawn or is deemed to be withdrawn, only by the issue of such a 
sample to an expert nominated by the person requesting the sample. (Rule 28(4) 
EPC). 

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States) 

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable) 

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications e.g., 'Accession 
Number of Deposit') 

For receiving Office use only 

[ ] This sheet was received with the international application 

Authorized officer 

Form PCT/RO/134 (July 1992) 

For International Bureau use only 

[ ] This sheet was received by the International Bureau on: 

Authorized officer 

INDICATIONS RELATING TO A DEPOSITED MICROORGANISM 

(PCT Rule 13bis) 

A. The indications made below relate to the microorganism referred to in the description 

B. IDENTIFICATION OF DEPOSIT 
Further deposits are identified on an additional sheet [ ] 

Name of depositary institution 
The National Collections of Industrial and Marine Bacteria Limited (NCIMB) 

Address of depositary institution (including postal code and country) 
23 St. Machar Drive 
Aberdeen 
Scotland 
AB2 1RY 
United Kingdom 

Date of deposit: 22 November 1993 
Accession Number: NCIMB 40601 

C. ADDITIONAL INDICATIONS (leave blank if not applicable) 
This information is continued on an additional sheet [ ] 

In respect of those designations in which a European patent is sought, and any 
other designated state having equivalent legislation, a sample of the deposited 
microorganism will be made available until the publication of the mention of the 
grant of the European patent or until the date on which the application has been 
refused or withdrawn or is deemed to be withdrawn, only by the issue of such a 
sample to an expert nominated by the person requesting the sample. (Rule 28(4) 
EPC). 

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States) 

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable) 

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications e.g., 'Accession 
Number of Deposit') 

For receiving Office use only 

[ ] This sheet was received with the international application 

Authorized officer 

For International Bureau use only 

[ ] This sheet was received by the International Bureau on: 

Authorized officer 

Form PCT/RO/134 (July 1992)

INDICATIONS RELATING TO A DEPOSITED MICROORGANISM

(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description
on page    line

B. IDENTIFICATION OF DEPOSIT

Further deposits are identified on an additional sheet □

Name of depositary institution

The National Collections of Industrial and Marine Bacteria Limited (NCIMB)

Address of depositary institution (including postal code and country)

23 St. Machar Drive
Aberdeen
Scotland
AB2 1RY
United Kingdom

Date of deposit

22 November 1993

Accession Number

NCIMB 40600

C. ADDITIONAL INDICATIONS (leave blank if not applicable) This information is continued on an additional sheet □

In respect of those designations in which a European patent is sought, and any
other designated state having equivalent legislation, a sample of the deposited
microorganism will be made available until the publication of the mention of the
grant of the European patent or until the date on which the application has been
refused or withdrawn or is deemed to be withdrawn, only by the issue of such a
sample to an expert nominated by the person requesting the sample. (Rule 28(4)
EPC).

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications e.g., 'Accession

CLAIMS

1. An in vivo expression system comprising a conjugate comprising a GOI and a Ltp2
gene promoter comprising the sequence shown as SEQ. I.D. 1, or a sequence that has
substantial homology with that of SEQ. I.D. 1, or a variant thereof; wherein the conjugate
is integrated, preferably stably integrated, within a monocotyledon's genomic DNA.

2. A transgenic cereal comprising a conjugate comprising a GOI and a Ltp2 gene
promoter as defined in claim 1 wherein the conjugate is integrated, preferably stably
integrated, within a cereal's genomic DNA.

3. The in vivo expression in the aleurone cells of a monocotyledon of a conjugate
comprising a GOI and a Ltp2 gene promoter as defined in claim 1; wherein the conjugate
is integrated, preferably stably integrated, within the monocotyledon's genomic DNA.

4. A method of enhancing in vivo expression of a GOI in just the aleurone cells of
a monocotyledon which comprises stably inserting into the genome of those cells a DNA
conjugate comprising a Ltp2 gene promoter as defined in claim 1 and a GOI; wherein in the
formation of the conjugate the Ltp2 gene promoter is ligated to the GOI in such a manner
that each of the myb site and the myc site in the Ltp2 gene promoter is maintained
substantially intact.

5. Use of a myb site and a myc site in a Ltp2 gene promoter to enhance in vivo
expression of a GOI in just the aleurone cells of a monocotyledon wherein the Ltp2 gene
promoter and the GOI are integrated into the genome of the monocotyledon.

6. A conjugate comprising a GOI and a Ltp2 gene promoter comprising the sequence
shown as SEQ. I.D. 1, or a sequence that has substantial homology with that of SEQ. I.D.
1, or a variant thereof.

7. The invention of any one of claims 1 to 6 wherein the promoter is a barley
aleurone specific promoter.

8. The invention of claim 7 wherein the promoter is for a 7 kDa lipid transfer protein.

9. The invention of any one of claims 1 to 8 wherein the promoter is used for
expression of a GOI in a cereal seed.

10. The invention of any one of claims 1 to 9 wherein the promoter is used for
expression of a GOI in a transgenic cereal seed.

11. The invention of any one of claims 1 to 10 wherein the cereal seed is any one of
a rice, maize, wheat, or barley seed, preferably maize.

12. The invention of any one of claims 1 to 11 wherein the promoter is the promoter
for Ltp2 of Hordeum vulgare.

13. The invention according to any one of the preceding claims wherein the conjugate
further comprises at least one additional sequence to increase expression of a GOI or the
GOI.

14. The invention according to any one of the preceding claims wherein the conjugate
is stably integrated within the genome of a developing grain.

FIGURE 1

-807 GATCTCGATGTGTAGTCTACGAGAAGG

-780 GTTAACCGTCTCTTCGTGAGAATAACCGTGGCCTAAAAATAAGCCGATGAGGATAAATAA

-720 AATGTGGTGGTACAGTACTTCAAGAGGmACTCATCWVGAGGATGCTTlTCCGATGAGC 

-660 TCTAGTAGTACATCGGACCTCACATACCTCCATTGTGGTGAAATATTTTGTGCTCATTTA

-600 GTGATGGGTAAATmGTTTATGTCACTCTAGGTTnGACATTTCAGTmGCCACTCn 

-540 AGGimGACAAATAAmCCAmCGCGGCAAAAGCAAAACAATmATmACT^ 

-480 CCACTCTTAGCTTTCACAATGTATCACAAATGCCACTCTAGAAATTCTGTTTATGCCACA 

-420 GAATGTGAAAAAAAACACTCACTTATTTGAAGCCAAGGTGTTCATGGCATGGAAATGTGA 

-360 CATAAAGTAACGTTCGTGTATAAGAAAAAATTGTACTCCTCGTAACAAGAGACGGAAACA 

-300 TCATGAGACAATCGCGTTTGGAAGGCTTTGCATCACCTTTGGATGATGCGCATGAATGGA

-240 GTCGTCTGCTT6CTAGCCTTCGCCTACCGCCCACTGAGTCCGGGCG6C44ffWCCATC6G 

- 180 CGAACGACCCflff^TVACCTCTACCGACCGGACTTGAATGCGCTACCTTCGTCAGCGACGA 

- 120 TGGCCGCGTACGCTGGCGACGTGCCCCCGiMrCCflrfiGCGGCACATGGCGAGCTCAGACC 

- 60 GTGCGTQGCTGGCSfflTACGTACCCCGTGAGTGCCCTAGCTAGAAACTTACACCTGC 

1 AACTGCGAGAGCGAGCGTGTGAGTGTAGCCGAGTAGATCACCGTACGACGACGACGAGG

60 GGCATGGCGATGGCGATGGGGATGGCGATGAGGAAGGAGGCAGCGGTGGCCGTGATGATG 

120 GTGATGGTGGTGACGCTGGCGGCGGGTGCGGACGCGGGAGCGGGAGCGGCGTGCGAGCCG 

180 GCGCAGCTGGCGGTGTGCGCGTCGGCGATCCTGGGCGGGACGAAGCCGAGCGGCGAGTGC 

240 TGCGGGAACCTGCGGGCGCAGCAGGGGTGCTTGTGCCAGTACGTCAAGGACCCCAACTAC 

300 GGGCACTACGTGAGCAGCCCACACGCGCGCGACACCCTCAACTTGTGCGGCATACCCGTA 

360 CCGCACTGCTAGCCGCCTAGCCGATCGAGGGCTCCAGGCACGCATGCATGTTCCTGTTAT 

420 G7GTATGTTGGAATAAAATGCTGGTGATCTATGGCGGCTAGCTTGCTTCCTGGCTAGCAG 

480 CTGCTGTAATGAAATTTGTGTTGCAACI 1 1 1 1 1 1 1 lAGTCC

FIGURE 2A



-807 GATCTCGATGTGTAGTCTACGAGAAGG

-780 GTTAACCGTCTCTTCGTGAGAATAACCGTGGCCTAAAAATAAGCCGATGAGGATAAATAA

-720 MTGTGGTGGTACAGTACTTCAAGAGGmACTCATCAAGAGGATGCTmCCGATGAGC 

-660 TCTAGTAGTACATCGGACCTCACATACCTCCATTGTGGTGAAATATTTTGTGCTCATTTA

-600 GTGATGGGTAAATTTTGmATGTCACTCTAGGTTTTGACAmCAGTm 

-540 AGGTrnGACAAATAATTrCCATTCCGCGGCAAAAGCAAAACAATmATmACTTTFA 

-480 CCACTCTTAGCTTTCACAATGTATCACAAATGCCACTCTAGAAATTCTGTTTATGCCACA

-420 GAATGTGAAAAAAAACACTCACTTATTTGAAGCCAAGGTGTTCATGGCATGGAAATGTGA 

-360 CATAAAGTAACGTTCGTGTATAAGAAAAAATTGTACTCCTCGTAACAAGAGACGGAAACA

-300 TCATGAGACAATCGCGTTTGGAAGGCTTTGCATCACCTTTGGATGATGCGCATGAATGGA

-240 6TCGTC71GCTTGCTAGCCTTCGCCTACCGCCCACTGAGTCCGGGCGGC(MCr4CCATCG6 

- 180 CGAACGACCCflffCrffACCTCTACCGACCGGACTTGAATGCGCTACCTTCGTCAGCGACGA 

-120 TGGCCGCGTACGCTGGCGACGTGCCCCCGftI7BCi17»GCGGCACATGGCGAGCTCAGACC 

-060 GTGCGTGGCTGGCD^CAMrACGTACCCCGTGAGTGCCCTAGCTAGAAACTTACACCTGC

FIGURE 2B

-807 GATCTCGATGTGTAGTCTACGAGAAGG 

-780 GTTAACCGTCTCTTCGTGAGAATAACCGTGGCCTAAAAATAAGCCGATGAGGATAAATAA

-720 MTGTGGTGGTACAGTACTTCAAGAGGmACTCATCAAGAGGATGCTTTTCCGATGAGC 

-660 TCTAGTAGTACATCGGACCTCACATACCTCCATTGTGGTGAAATATTTTGTGCTCATTTA

-600 GTGATGGGTAMTmGmATGTCACTCTAGGTmGACAmCAGTTTTGCCACTCTT 

-540 AGGTTTTGACAAATAAmCCAmCGCGGCAAAAGCAAAACAATmATmACTTTTA 

-480 CCACTCTTAGCTTTCACAATGTATCACAAATGCCACTCTAGAAATTCTGTTTATGCCACA

-420 GAATGTGAAAAAAAACACTCACTTATTTGAAGCCAAGGTGTTCATGGCATGGAAATGTGA

-360 CATAAAGTAACGTTCGTGTATAAGAAAAAATTGTACTCCTCGTAACAAGAGACGGAAACA 

-300 TCATGAGACAATCGCGTTTGGAAGGCTTTGCATCACCTTTGGATGATGCGCATGAATGGA

-240 GTCGTCTGCTTGCTAGCCTTCGCCTACCGCCCACTGAGTCCGGGCGGCMCWCCATCGG 

- 180 CGAACGACC£MCC7»ACCTCTACCGACCGGACTTGAATGCGCTACCTTCGTCAGCGACGA 

- 120 TGGCCGCGTACGCTGGCGACGTGCCCCCGCfl TGCA 7BGCGGCACATGGCGAGCTCAGACC 

- 60 GTGCGTGGCTGGCESSarACGTACCCCGTGAGTGCCCTAGCTAGAAACTTACACCTGC 

1 AACTGCGAGAGCGAGCGTGTGAGTGTAGCCGAGTAGATC

FIGURE 4a-b

FIGURE 4c-e

FIGURE 7

MYB MYC
TAACTG CANNTG
C G

Ltp2 GG£M4CWCCATCGGC6MCGACCC«ffff7BACCTCTACC6ACCGGACTT6- gSnt- tTACAAAf

(57) Abstract

An expression system for at least the aleurone cells of a developing caryopsis
or for at least the scutellar epithelial tissue or vascular tissue of a germinating
seedling or developing grain or plant (e.g. in the root, leaves and stem) is described.
The expression system comprises a gene promoter fused to a GOI (gene of interest).
In a preferred embodiment the expression system comprises the GOI fused
to a modified Ltp1 gene promoter.

PROMOTER FROM A LIPID TRANSFER PROTEIN GENE
The inyc^ „^ „ , ^ ^ ^ ^ 

In particular, the present invention relates to the use of a promoter for the expression
of a gene of interest (GOI) in a specific tissue or tissues of a plant.

More in particular the present invention relates to a modified promoter for a lipid
transfer protein (Ltp) gene known as the Ltp1 gene. The present invention also
relates to the application of this modified Ltp1 gene promoter to express a GOI in a
specific tissue or specific tissues of a plant. For example, expression can be in either
the aleurone layer or the scutellar epithelial layer of a monocotyledon, especially a
transgenic cereal caryopsis (or grain), more especially a developing transgenic cereal
caryopsis (or grain). Particular examples include expression in the scutellar epithelial
tissue or vascular tissue of a transgenic rice plant, in particular in the vascular
bundles and tip of emerging shoots and roots, leaf veins and vascular bundles of
stems.

A diagrammatic illustration of a developing caryopsis (or grain) is presented in Figure
1, which is discussed in detail later. In short, a typical developing caryopsis (or
grain) comprises an endosperm component and an embryo component. The
endosperm, which is the site of deposition of different storage products such as starch
and proteins, supports the growth of the emerging seedling during a short period of
time after germination. The embryo gives rise to the vegetative plant. These
components and aspects are further discussed in Bosnes et al. 1992 and Olsen et al.
1992.

The embryo component can be divided into a scutellum and an embryo axis. The
scutellum can be sub-divided into an epithelial layer, which is usually one cell thick,
and an inner body of parenchyma cells. Likewise, the embryo axis can be
sub-divided into a root component and a shoot component.

The endosperm component of mature grains can be divided into a peripheral layer of
living aleurone cells surrounding a central mass of non-living starchy endosperm
cells. The aleurone layer in barley is three cells thick. During caryopsis
germination, the cells of the aleurone layer produce amylolytic and proteolytic
enzymes that degrade the storage compounds into metabolites that are taken up and
are used by the growing embryo.

Two aspects of aleurone cell biology that have been intensively studied are the
genetics of anthocyanin pigmentation of aleurone cells in maize (McClintock, 1987)
and the hormonal regulation of gene transcription in the aleurone layer of germinating
barley caryopsis (Fincher, 1989). Using transposon tagging, several structural and
regulatory genes in the anthocyanin synthesis pathway have been isolated and
characterized (Paz-Ares et al., 1987; Dellaporta et al., 1988). In barley, α-amylase
and β-glucanase genes that are expressed both in the aleurone layer and embryos of
mature germinating caryopsis have been identified (Karrer et al., 1991; Slakeski and
Fincher, 1992). In addition, two other cDNAs representing transcripts that are
differentially expressed in the aleurone layers of developing barley grains have been
isolated. These are CHI26 (Lea et al., 1991) and pZE40 (Smith et al., 1992).

None of these references discloses expression of those gene products in specific cell
types of developing grains of transgenic cereal plants or in the scutellar epithelial
tissue or vascular tissue of a germinating rice seedling or a developing rice grain or
rice plant.

In the life of a developing caryopsis (or grain), the embryo component of a dried
caryopsis will imbibe water. The presence of water triggers the production of the
hormone gibberellic acid in the embryo. In barley and other grass caryopsis, the
embryo releases the gibberellic acid which in turn causes expression of a number of
genes in the aleurone layer of the endosperm resulting in the production of a number
of enzymes such as α-amylases, proteases and β-glucanases. Similar enzymes are
also produced by expression of genes in the epithelial layer.

These degradative enzymes digest certain components of the developing caryopsis (or
grain) to form sugars and amino acids.

For example, the α-amylases digest the starch store in the starchy endosperm,
whereas the proteases digest the storage proteins and the β-glucanases digest the cell
walls. The resultant sugars and amino acids cross the epithelial layer and trigger
growth of the shoot and root of the embryo axis - i.e. start the germination process.

In some cases it is desirable to transform seeds, grains, caryopsis and plants by
introducing genes which, as a result of their expression, yield new or improved
properties to the resulting transformed seeds, grains, caryopsis or plants. For
example, it may be desirable to alter the expression levels of a natural structural gene
which may be under- or over-expressed. It may even be desirable to reduce or
eliminate a disease which harms or destroys the seed, grain, caryopsis or the plant.

It may even be desirable to make the seed, grain, caryopsis or the plant resistant to
herbicides. It may even be desirable to prevent or to reduce the extent of pre-harvest
sprouting.

It may even be desirable for the seed, grain, caryopsis, or plant to produce
compounds useful for mammalian usage, such as human insulin.

Some techniques are known for addressing some of those aims.

For example, the bacterium Agrobacterium tumefaciens has been used to introduce
desired genes into the chromosome of a plant. For example the gene coding for
EPSP synthase, a key enzyme in the synthesis of aromatic amino acids in plants, has been
isolated and introduced into petunia plants under the control of a CaMV promoter
(Shah et al., 1986). The transgenic plants expressed increased levels of EPSP
synthase in their chloroplasts and were more tolerant to glyphosate - which
inhibits EPSP synthase.

Other examples may be found in R.W. Old & S.B. Primrose (1993). Another use
of Agrobacterium tumefaciens is described in De Silva et al. (1992) wherein a
recombinant DNA construct is described containing a plant plastid specific promoter
that expresses a gene placed under its control in concert with the fatty acid or lipid
biosynthesis in the plant cell.

PCT WO 90/01551 mentions the use of the aleurone cells of mature, germinating
caryopsis to produce proteins from GOIs under the control of an α-amylase promoter.
This promoter is active only in germinating caryopsis.



Non-specific lipid transfer proteins (nsLTPs) have the ability to mediate in vitro
transfer of radiolabeled phospholipids from liposomal donor membranes to
mitochondrial acceptor membranes (Kader et al., 1984; Watanabe and Yamada,
1986). Although their in vivo function remains unclear, nsLTPs from plants have
recently received much attention due to their recurrent isolation as cDNA clones
representing developmentally regulated transcripts expressed in several different
tissues. A common feature is that, at some point in their development, they are
highly expressed in tissues producing an extracellular layer rich in lipids.

In particular, transcripts corresponding to cDNAs encoding 10 kDa nsLTPs have been
characterized in the tapetum cells of anthers as well as the epidermal layers of leaves
and shoots in tobacco (Koltunow et al., 1990; Fleming et al., 1992), and barley
aleurone layers (Mundy and Rogers, 1986; Jakobsen et al., 1989).

In addition, a 10 kDa nsLTP has been discovered to be one of the proteins secreted
from auxin-treated somatic carrot embryos into the tissue culture medium (Sterk et
al., 1991).

Based on in situ hybridisation data demonstrating that the Ltp transcripts are localized
in the protoderm cells of the somatic and zygotic carrot embryo, it was suggested that
in vivo nsLTPs are involved in either cutin biosynthesis or in the biogenesis and
degradation of storage lipids (Sossountzov et al., 1991; Sterk et al., 1991).

A nsLTP in Arabidopsis has been localized to the cell walls, lending further support
to an extracellular function of this class of proteins (Thoma et al., 1993).

Recently, using a standard in vitro Ltp assay, two 10 kDa nsLTPs and one member
of a novel class of 7 kDa nsLTPs were isolated from wheat seeds (Monnet, 1990;
Dieryck et al., 1992).

The sequence of this 7 kDa wheat nsLTP protein shows a high degree of similarity
with the predicted protein from the open reading frame (ORF) of the B11E cDNA,
which had been isolated in a differential screening for barley aleurone specific
transcripts (Jakobsen et al., 1989). However, the amino acid sequence of this
polypeptide showed only limited sequence identities with the previously sequenced 10
kDa proteins. In sub-cellular localisation studies using gold labelled antibodies one
10 kDa protein from Arabidopsis was localised to the cell wall of epidermal leaf cells.
The presence of a signal peptide domain in the N-terminus of the open reading frames
of all characterised plant nsLTP cDNAs also suggests that these are proteins destined
for the secretory pathway with a possible extracellular function.

Olsen et al. in a paper titled "Molecular Strategies For Improving Pre-Harvest
Sprouting Resistance In Cereals" published in 1990 in the published extracts from the
Fifth International Symposium On Pre-Harvest Sprouting In Cereals (Westview Press
Inc.) describe three different strategies for expressing different "effector" genes in the
aleurone layer in developing grains of transgenic plants. This document mentions 4
promoter systems - including a system called B11E.

Kalla et al. (1993) in a paper titled "Characterisation of Promoter Elements Of
Aleurone Specific Genes From Barley" describe the possibility of the expression of
anti-sense genes by the use of promoters of the aleurone genes B22E, B23D, B14D,
and B11E.

Linnestad et al. (1991) describe the isolation and sequencing of the Ltp1 gene and
disclose a 787 base pair fragment of the Ltp1 gene promoter fused to a fragment of
the Ltp1 structural gene. This paper does not disclose any expression studies using
the 787 base pair fragment.

Skriver et al. (1992) report further on the Ltp1 gene. This paper says that the Ltp1
gene promoter is only aleurone specific. To confirm this submission the paper
further reports on the isolation and fusion of a 769 bp fragment (-702 to +67 bp) of
the gene to the bacterial β-glucuronidase (GUS) reporter gene. This fragment
therefore contains 635 bp of the Ltp1 gene promoter. Subsequent transient expression
studies showed that the shortened gene promoter resulted only in aleurone specific
expression. Expression was not observed in any other tissue. The authors conclude
that there are sequences between the -702 and +67 bp of Ltp1 which contain DNA
elements that specifically modulate its transcription in aleurone cells.


One of the major limitations to the molecular breeding of new types of crop plants
with specific cells expressing GOIs is the lack of a suitable tissue specific promoter.
In particular, there is a lack of a tissue specific promoter that leads to expression of
a GOI in a developing caryopsis (or grain) or in a germinating rice seedling or in a
developing grain, in particular in the scutellar epithelial tissue or vascular tissue of
a germinating seedling or a developing grain or a plant.

Moreover, all of the available promoters - such as the CaMV 35S, rice actin and
maize alcohol dehydrogenase - are constitutive, i.e. they are fairly non-specific in
target site or stage development as they drive expression in most cell types in the
plants.

Hence, another problem that arises is how to achieve expression of a product coded
for by a GOI in a specific tissue that gives minimal interference with the developing
embryo and seedling.

Our co-pending United Kingdom patent application (GB 9324707.0) describes the use
of an Ltp2 gene promoter for expression of a GOI in the aleurone layer. However,
in spite of this teaching, there is still a need for other tissue specific promoters, such
as another aleurone specific promoter or, preferably, a promoter specific for vascular
tissue and/or the scutellar epithelial layer. In this regard, it is still desirable to
provide other tissue specific expression of GOIs in cereals such as rice, maize, wheat,
barley and other transgenic cereal plants. Moreover it is desirable to provide tissue
specific expression that does not detrimentally affect the developing embryo and the
developing caryopsis (or grain).


According to a first aspect of the present invention there is provided a modified Ltp1
gene promoter which is integrated, preferably stably integrated, within a plant
material's genomic DNA and which is capable of inducing expression of a GOI when
fused to the gene promoter in at least the aleurone cells or in at least the scutellar
epithelial tissue or vascular tissue of a germinating seedling or a developing grain or
a plant (e.g. in the root, leaves and stem).

According to a second aspect of the present invention there is provided a modified
Ltp1 gene promoter according to claim 1 or claim 2 wherein the promoter comprises
the nucleic acid sequence shown as SEQ. I.D. 1, or a sequence that has substantial
homology with that of SEQ. I.D. 1, or a variant thereof.

According to a third aspect of the present invention there is provided an isolated Ltp1
gene promoter comprising the sequence shown as SEQ. I.D. 1, or a sequence that has
substantial homology therewith, or a variant thereof.

According to a fourth aspect of the present invention there is provided a construct
comprising a GOI and a modified Ltp1 gene promoter according to the present
invention; wherein the construct is capable of being expressed in at least the aleurone
cells or in at least the scutellar epithelial tissue or vascular tissue of a plant material;
and wherein if there is expression in just the aleurone layer of a developing barley
caryopsis then the fused promoter and GOI are not the 769 bp fragment of Skriver
et al. (1992).

According to a fifth aspect of the present invention there is provided an expression
system for at least the aleurone cells or for at least the scutellar epithelial tissue or
vascular tissue of a plant material, the expression system comprising a GOI fused to
a modified Ltp1 gene promoter wherein the expression system is capable of being
expressed in at least the aleurone cells or in at least the scutellar epithelial tissue or
vascular tissue of the plant material; and wherein if there is expression in just the
aleurone layer of a developing barley caryopsis then the fused promoter and GOI are
not the 769 bp fragment of Skriver et al. (1992).

According to a sixth aspect of the present invention there is provided an expression
system for at least the aleurone cells of a developing caryopsis or for at least the
scutellar epithelial tissue or vascular tissue of a germinating seedling or developing
grain or plant (e.g. in the root, leaves and stem), the expression system comprising
a gene promoter fused to a GOI wherein the expression system is capable of being
expressed in at least the aleurone cells of the developing caryopsis or in at least the
scutellar epithelial tissue or vascular tissue of a germinating seedling or a developing
grain or a plant (e.g. in the root, leaves and stem); either wherein if there is
expression in just the aleurone layer of a developing barley caryopsis then either the
promoter is not the wild type Ltp1 promoter in its natural environment and the GOI
is not the Ltp1 functional gene in its natural environment; or wherein if there is
expression in just the aleurone layer of a developing caryopsis then the fused
promoter and GOI are not the 769 bp fragment of Skriver et al. (1992).

According to a seventh aspect of the present invention there is provided a transgenic
cereal comprising an expression system according to the present invention or a
construct according to the present invention wherein the expression system or
construct is integrated, preferably stably integrated, within the cereal's genomic
DNA.

According to an eighth aspect of the present invention there is provided the use of a
gene promoter according to the present invention to induce expression of a GOI when
fused to the gene promoter in at least the aleurone cells or in at least the scutellar
epithelial tissue or vascular tissue of a plant material.


According to a ninth aspect of the present invention there is provided a process of
expressing a GOI when fused to a gene promoter according to the present invention
wherein expression occurs in at least the aleurone cells or in at least the scutellar
epithelial tissue or vascular tissue of a plant material.

According to a tenth aspect of the present invention there is provided a process of
expressing in at least the scutellar epithelial tissue or vascular tissue of a developing
grain or a germinating seedling or a plant, preferably a developing rice grain or a
germinating rice seedling or a transgenic rice plant, an expression system according
to the present invention or a construct according to the present invention wherein the
expression system or construct is integrated, preferably stably integrated, within the
cereal's genomic DNA.

According to an eleventh aspect of the present invention there is provided a
combination expression system comprising a. as a first construct, a construct
according to the present invention; and b. as a second construct, a construct
comprising a GOI and another gene promoter that is tissue- or stage-specific.

According to a twelfth aspect of the present invention there is provided a developing
cereal grain, preferably a germinating rice seedling, comprising any one of: a
promoter according to the present invention, an expression system according to the
present invention, a construct according to the present invention, or a combination
expression system according to the present invention.

According to a thirteenth aspect of the present invention there is provided plasmid
NCIMB 40609.

Preferably the plant material is a developing caryopsis, a germinating seedling, a
developing grain or a plant.

Preferably the construct is capable of being expressed in at least the aleurone cells of
a developing caryopsis or in at least the scutellar epithelial tissue or vascular tissue
of a germinating seedling or a developing grain or a plant when the construct is
integrated, preferably stably integrated, within the caryopsis's or grain's or seedling's
or plant's genomic DNA.

Preferably the modified Ltp1 gene promoter comprises the nucleic acid sequence
shown as SEQ. I.D. 1, or a sequence that has substantial homology with that of SEQ.
I.D. 1, or a variant thereof.

Preferably the construct further comprises at least one additional sequence to increase
expression of the GOI.

Preferably the expression system is for at least the aleurone cells of a developing
caryopsis or for at least the scutellar epithelial tissue or vascular tissue of a
germinating seedling or developing grain or plant (e.g. in the root, leaves and stem).

Preferably the expression system is additionally capable of being expressed in the
embryo cells of the germinating grain or the plantlet.

Preferably the expression system is integrated, preferably stably integrated, within a
developing caryopsis's genomic DNA or a germinating seedling's genomic DNA or
a developing grain's genomic DNA or a plant's genomic DNA.

Preferably, in the expression system, the gene promoter comprises the sequence
shown as SEQ I.D. No. 1 or comprises a sequence that has substantial homology
therewith, or is a variant thereof.

Preferably, the expression system comprises the construct according to the present
invention.

Preferably, in the use, the gene promoter is used to induce expression of a GOI when
fused to the gene promoter in at least the aleurone cells of a developing caryopsis or
in at least the scutellar epithelial tissue or vascular tissue of a germinating seedling
or a developing grain or a plant (e.g. in the root, leaves and stem).

Preferably, the gene promoter expresses the GOI when fused to the gene promoter
in at least the aleurone cells of a developing caryopsis or in at least the scutellar
epithelial tissue or vascular tissue of a germinating seedling or a developing grain or
a plant (e.g. in the root, leaves and stem).

Preferably the promoter and GOI are integrated, preferably stably integrated, within
a cereal's genomic DNA.

Preferably the gene promoter is a fragment of a barley Ltp1 gene promoter.

Preferably the promoter is for a 10 kDa lipid transfer protein.

Preferably the gene promoter is obtainable from plasmid NCIMB 40609.

Preferably the gene promoter is used for expression of a GOI in a cereal caryopsis
or a cereal grain or a cereal seedling or a cereal plant.

Preferably the cereal caryopsis is a developing cereal caryopsis, the cereal grain is
a developing cereal grain, and the cereal seedling is a germinating cereal seedling.

Preferably the cereal is any one of a rice, maize, wheat, or barley.

Preferably the cereal is rice or maize.

Preferably the developing caryopsis is a developing barley caryopsis, the germinating
seedling is a germinating rice seedling, the developing grain is a developing rice
grain, and the plant is a transgenic rice plant.

Preferably in the combination expression system each construct is integrated,
preferably stably integrated, within a plant material.

Preferably each of the myb site and the myc site in the gene promoter is maintained 
substantially intact. 
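
Whether the myb site and the myc site survive a given truncation or modification of a promoter fragment can be checked mechanically. The following is a minimal sketch only. It assumes consensus motifs of the form TAACTG for myb (read here, as an assumption, with C allowed at the first position and G at the fifth) and CANNTG for myc. The function names, the reading of the alternative bases and the short example sequence are hypothetical and are not taken from the sequence listing.

```python
import re

# Assumed consensus motifs (illustrative only, not taken from SEQ. I.D. 1).
MOTIFS = {
    "myb": "[CT]AAC[GT]G",   # TAACTG, allowing C at position 1 and G at position 5
    "myc": "CA[ACGT]{2}TG",  # CANNTG
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_sites(seq: str) -> dict:
    """Map each motif name to a sorted list of (start, strand) hits.

    Starts are 0-based and always given in forward-strand coordinates.
    """
    seq = seq.upper()
    n = len(seq)
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead is used so that overlapping matches are all reported.
        regex = re.compile(f"(?=({pattern}))")
        found = {(m.start(), "+") for m in regex.finditer(seq)}
        for m in regex.finditer(reverse_complement(seq)):
            # Convert a reverse-strand match back to forward coordinates.
            found.add((n - (m.start() + len(m.group(1))), "-"))
        hits[name] = sorted(found)
    return hits


def sites_intact(original: str, modified: str) -> bool:
    """True if every motif present in `original` still matches somewhere in `modified`."""
    before, after = find_sites(original), find_sites(modified)
    return all(after[name] for name in before if before[name])


# Hypothetical 19-nt example containing one myb-like and one myc-like site.
example = "GGTAACTGAAACACGTGCC"
print(find_sites(example))
print(sites_intact(example, example[2:17]))            # truncation keeping both sites
print(sites_intact(example, "GGTAAGTGAAACACGTGCC"))    # myb-like site mutated
```

The check is deliberately coarse: it only asks whether a match to each assumed consensus is still present, not whether the spacing between the sites or their flanking sequence is preserved, which would need to be assessed separately.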


Preferably the gene promoter is integrated, preferably stably integrated, in the 
developing caryopsis *s genomic DNA or the germinating seedling's genomic DNA 
or the developing grain's genomic DNA or the plant's genomic DNA and which is 
capable of inducing expression of a GOI when fused to the gene promoter in at least 
15 the aleurone cells of the developing caryopsis or in at least the scutellar epithelial 
tissue or vascular tissue of a germinating seedling or a developing grain or a plant 
(e.g. in the root, leaves and stem). 

Preferably the transgenic developing caryopsis, germinating seedling, developing
grain or plant is prepared by stable integration of the GOI and the gene promoter to
form a stable transgenic plant. This ensures aleurone or epithelial or vascular
expression at, at least, the developing caryopsis stage. One preferred method for
achieving this includes preparing the transgenic developing caryopsis, germinating
seedling, developing grain or plant by stable integration of the GOI and the gene
promoter at the protoplast level.

Preferably the promoter is used for expression of a GOI in a monocotyledonous 
species, including a grass - preferably a transgenic cereal grain or caryopsis. 
Preferably the gene promoter is used for expression of a GOI in a cereal grain or
caryopsis. Preferably the cereal grain or caryopsis is a developing cereal grain or
caryopsis. Preferably the cereal grain or caryopsis is any one of a rice, maize,
wheat, or barley grain or caryopsis.

Preferably the cereal grain is a rice grain.

Preferably the DNA sequence for the modified Ltp1 gene promoter is the nucleic acid
sequence shown as SEQ. I.D. 1.

Preferably in the combination expression system each construct is integrated,
preferably stably integrated, within a developing caryopsis's genomic DNA or a
grain's genomic DNA or a seedling's genomic DNA or a plant's genomic DNA.

Preferably, in the combination expression system, the first construct comprises the
modified Ltp1 gene promoter according to the present invention.

Preferably, the promoter in the second construct is an aleurone specific promoter. 

Preferably the promoter in the second construct is a barley promoter.

Preferably the promoter in the second construct is the B22E gene promoter.

Preferably the promoter in the second construct is the Ltp2 gene promoter. 

Preferably the promoter in the second construct is for a 7 kDa lipid transfer protein.

Preferably the promoter in the second construct is the promoter for Ltp2 of Hordeum
vulgare.

Preferably the promoter in the second construct comprises the sequence shown as 
SEQ. I.D. 2, or a sequence that has substantial homology therewith, or a variant 
thereof. 



Preferably each of the myb site and the myc site in the Ltp2 gene promoter is
maintained substantially intact.

Preferably the second construct further comprises at least one additional sequence to
increase expression of the GOI.



Preferably, in the combination expression system, the grain or caryopsis is as defined
above for the present invention.



Preferably the gene promoter is obtainable from plasmid NCIMB 40609. 

A preferred embodiment of the present invention is a modified Ltp1 gene promoter
which is integrated, preferably stably integrated, within a plant material's genomic
DNA and which is capable of inducing expression of a GOI when fused to the gene
promoter in at least the aleurone cells or in at least the scutellar epithelial tissue or
vascular tissue of a germinating seedling or a developing grain or a plant (e.g. in the
root, leaves and stem), but wherein if there is expression in just the aleurone layer
of a developing seed then the fused promoter and GOI are not the 769 bp fragment
of Skriver et al. (1992).

An even more preferred embodiment of the present invention is a modified Ltp1 gene
promoter which is integrated, preferably stably integrated, within a plant material's
genomic DNA and which is capable of inducing expression of a GOI when fused to
the gene promoter in at least the aleurone cells or in at least the scutellar epithelial
tissue or vascular tissue of a germinating seedling or a developing grain or a plant
(e.g. in the root, leaves and stem), wherein the promoter comprises the nucleic acid
sequence shown as SEQ. I.D. 1, or a sequence that has substantial homology with
that of SEQ. I.D. 1, or a variant thereof.

As a highly preferred embodiment, the present invention therefore provides transgenic
rice comprising a construct comprising a GOI fused to a modified Ltp1 gene
promoter; wherein the construct is integrated, preferably stably integrated, within the
rice's genomic DNA, and wherein the GOI is expressed in at least the vascular tissue
and/or scutellar epithelial layer of a germinating rice seedling or a developing rice
grain or a rice plant.

In a more preferred embodiment the present invention provides a transgenic rice
seedling, grain or plant comprising a construct comprising a GOI fused to a modified
Ltp1 gene promoter, wherein the construct is integrated, preferably stably integrated,
within the rice's genomic DNA; wherein the GOI is expressed in at least the scutellar
epithelial tissue or vascular tissue of a germinating seedling or a developing grain or
a plant, and wherein the modified Ltp1 gene promoter comprises the nucleic acid
sequence shown as SEQ. I.D. 1, or a sequence that has substantial homology with
that of SEQ. I.D. 1, or a variant thereof.

The additional sequence(s) for the construct(s) for increasing the expression of the
GOI(s) may be one or more repeats (e.g. tandem repeats) of the promoter upstream
box(es) which are responsible for the aleurone layer or scutellar epithelial cell and/or
vascular expression pattern of the modified Ltp1 gene promoter. The additional
sequence may even be a Sh1-intron.

The term "plant material" includes a developing caryopsis, a germinating caryopsis
or grain, or a seedling, a plantlet or a plant, or tissues or cells thereof, such as the
aleurone cells of a developing caryopsis or the scutellar epithelial tissue or vascular
tissue of a germinating seedling or developing grain or plant (e.g. in the root, leaves
and stem).

Thus a preferred aspect of the present invention comprises plant material comprising
a GOI and a modified Ltp1 gene promoter which is capable of inducing expression
of the GOI when fused to the gene promoter in at least the aleurone cells or in at least
the scutellar epithelial tissue or vascular tissue of the plant material; wherein the
construct is capable of being expressed in at least the aleurone cells or in at least the
scutellar epithelial tissue or vascular tissue of the plant material, when the construct
is integrated, preferably stably integrated, within the caryopsis's or grain's or
seedling's or plant's genomic DNA; and wherein the modified Ltp1 gene promoter
comprises the nucleic acid sequence shown as SEQ. I.D. 1, or a sequence that has
substantial homology with that of SEQ. I.D. 1, or a variant thereof.

The term "modified" with reference to the present invention means any Ltp1 gene
promoter that is different to the wild type promoter but wherein the promoter induces
expression in at least the aleurone cells of a developing caryopsis or in at least the
scutellar epithelial tissue or vascular tissue of a germinating seedling or a developing
grain or a plant (e.g. in the root, leaves and stem).

In particular, a preferred modified Ltp1 gene promoter is a shortened wild type Ltp1
gene promoter but wherein the promoter induces expression in at least the aleurone
cells of a developing caryopsis or in at least the scutellar epithelial tissue or vascular
tissue of a germinating seedling or a developing grain or a plant (e.g. in the root,
leaves and stem).

The term "transgenic" in relation to the present invention - in particular in relation
to the developing caryopsis, germinating seedlings, developing grains and plants of
the present invention - does not include a wild type promoter in its natural environment
in combination with its associated functional gene (GOI) in its natural environment.
Thus, the term includes developing caryopsis or seedlings or grains or plants
incorporating the GOI, which may be natural or non-natural to the caryopsis
or seedling or grain or plant in question, operatively linked to the modified Ltp1
promoter of the present invention.

The term "GOI" with reference to the present invention means any gene of interest.
A GOI can be any gene that is either foreign or natural to the cereal in question,
except for the wild type Ltp1 functional gene when in its natural environment. In the
combination expression system the GOIs may be the same or different.

Typical examples of a GOI include genes encoding for proteins and enzymes that
modify metabolic and catabolic processes. For example, the GOI may code for a protein
giving added nutritional value to the grain or caryopsis as a food or crop. Typical
examples include plant proteins that can inhibit the formation of anti-nutritive factors
and plant proteins that have a more desirable amino acid composition (e.g. a higher
lysine content than the non-transgenic plant).

The GOI may even code for an enzyme that can be used in food processing such as
chymosin, thaumatin, α-galactosidase and guar.

In a preferred embodiment, particularly with vascular expression, the GOI may code
for an agent for introducing or increasing pathogen resistance.

The GOI may even be an antisense construct for modifying the expression of natural 
transcripts present in the relevant tissues. 

The GOI may even code for a non-natural plant compound that is of benefit to
animals or humans. For example, the GOI could code for a pharmaceutically active
protein or enzyme such as the therapeutic compounds insulin, interferon, human
serum albumin, human growth factor and blood clotting factors. In this regard, the
transformed cereal grain or caryopsis could prepare acceptable quantities of the
desired compound which could be easily retrievable from the scutellar epithelial layer,
the aleurone layer or the vascular tissue.

Preferably the GOI is a gene encoding for any one of a protein having a high
nutritional value, a Bacillus thuringiensis insect toxin, an α- or β-amylase antisense
transcript, a protease antisense transcript, or a glucanase antisense transcript.

The term "a variant thereof" with reference to the present invention means any
substitution of, variation of, modification of, replacement of, deletion of or the
addition of one or more nucleic acid(s) from or to the promoter sequence providing
the resultant sequence exhibits at least aleurone, scutellar epithelial or vascular
expression, respectively. The term also includes sequences that can substantially
hybridise to the promoter sequence.

The term "substantial homology" covers homology with respect to at least the
essential nucleic acids/nucleic acid residues of the promoter sequence providing the
homologous sequence acts as a promoter, e.g. as a promoter for at least aleurone
expression in a developing caryopsis or for at least scutellar epithelial tissue or
vascular tissue expression in a germinating seedling or in a developing grain or plant.
Preferably there is at least about 80% homology, more preferably at least about 90%
homology, and even more preferably there is at least about 95% homology with the
promoter sequence shown as SEQ. I.D. No. 1 or SEQ. I.D. No. 2, respectively.
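By way of illustration only, the percentage thresholds above can be checked with a short calculation. The following Python sketch assumes that homology is scored as simple percent identity over a pairwise alignment that has already been computed; the present text does not prescribe a scoring method, so that choice, the function names and the toy sequences are all assumptions made for the example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns that carry the same base in both sequences.

    Both inputs must already be aligned (equal length, '-' marks a gap).
    Columns in which both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def homology_tier(identity: float) -> str:
    """Map a percent identity onto the preference tiers quoted in the text."""
    if identity >= 95.0:
        return "at least about 95%"
    if identity >= 90.0:
        return "at least about 90%"
    if identity >= 80.0:
        return "at least about 80%"
    return "below about 80%"


# Hypothetical 10-base stretches differing at one position.
reference = "TAACTGCATG"
candidate = "TAACTGCTTG"
identity = percent_identity(reference, candidate)
print(identity, homology_tier(identity))
```

A gap in one sequence counts as a mismatch in this sketch; a different convention would shift the percentages slightly.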

The term "maintained substantially intact" means that at least the essential
components of each of the myb site and the myc site remain in the construct to ensure
acceptable expression of a GOI. Preferably at least about 75%, more preferably at
least about 90%, and even more preferably at least about 95%, of the myb
or myc site is left intact.

The term "construct" - which is synonymous with terms such as "cassette", "hybrid"
and "conjugate" - includes a GOI directly or indirectly attached to the modified gene
promoter, such as to form a [modified Ltp1 gene promoter-GOI] cassette. An
example of an indirect attachment is the provision of a suitable spacer group such as
an intron sequence, such as the Sh1-intron, intermediate the promoter and the GOI.
The same is true for the term "fused" in relation to the present invention which
includes direct or indirect attachment.

The term "expression system" means that the system defined above can be expressed
in an appropriate organism, tissue, cell or medium. In this regard, the expression
system of the present invention may comprise additional components that ensure or
increase the expression of the GOI by use of the gene promoter.

As indicated above, the expression system of the present invention can also be used
in conjunction with another expression system, preferably an expression system that
is also tissue and/or stage specific.

For example, the construct comprising the modified Ltp1 gene promoter (e.g. the 787
bp fragment of SEQ. I.D. NO. 1) can be used in conjunction with a construct
comprising the Ltp2 gene promoter (e.g. SEQ. I.D. NO. 2) - which is the subject of
our co-pending UK patent application GB 9324707.0.

In this respect, and with reference to barley, in the early stages of developing
caryopsis the modified Ltp1 gene promoter effects expression of a GOI in at least the
aleurone layers of developing caryopsis. This expression can then be complemented
by use of the Ltp2 gene promoter which can express a GOI (which may be the same as
or different from that operatively linked to the modified Ltp1 gene promoter) in high
levels in the aleurone layer of developing grains.

However, the combination expression system is very effective for transgenic rice.
In this respect, in the early stages of developing caryopsis the modified Ltp1 gene
promoter expresses a GOI in the scutellar epithelial layer and the vascular tissue.
This expression can then be complemented by use of the Ltp2 gene promoter which can
express a GOI in high levels in the aleurone layer of developing grains. This combination
is particularly advantageous for controlling pre-harvest sprouting, where the first
response is production of α-amylase in the scutellar epithelium cells, as this can be
reduced or prevented by placing an antisense α-amylase gene under the control of the
Ltp1 promoter. In this system, the expression of antisense α-amylase would block the
synthesis of α-amylases in the scutellar epithelial cells - where they are first made.
The same or another GOI could be expressed in the aleurone layer via the Ltp2 gene
promoter.

The construct comprising the modified Ltp1 gene promoter may even be used in
conjunction with a construct comprising the B22E gene promoter - details of which
may be found in Olsen et al. (1990) and Klemsdal et al. (1991). This gene
promoter, which is expressed in immature aleurone layers, has been shown by
particle bombardment experiments to be capable of driving Gus expression in
developing barley grains. Also, using Northern analysis, as well as in situ
hybridization, it has been shown that the B22E cDNA probe hybridizes to transcripts
in the aleurone layer and in the scutellum parenchyma cells and the provascular
bundle of the embryo axis in developing barley grains. In addition, a hybridizing
transcript is also present in the ventral vascular strand of developing caryopsis (Olsen
et al., 1990).

We have also found that by using a 4.6 kb B22E promoter fragment contained on a
XbaI-ClaI fragment of a genomic clone fused to the Gus reporter gene, transformed
rice plants could be prepared. Those transformed rice plants exhibited strong
expression in the vascular tissue (phloem) of the ventral strand of the developing rice
grain. This expression pattern was completely unexpected in view of Klemsdal et al.
(1991). Expression, although weaker, in the same cell type was also observable in
the stem of young shoots. Thus, using the B22E promoter, a GOI transcript can be
expressed in the aleurone layers of developing grains, the parenchyma cells of the
embryonic scutellum and the ventral vascular bundle of developing grains.

The combination of the use of the modified Ltp1 gene promoter and the B22E gene
promoter could even include the use of another gene promoter, such as the Ltp2 gene
promoter, to express three GOIs respectively wherein each GOI may be the same or
different.

One or more of the other expression systems to be used in conjunction with the
modified Ltp1 gene promoter expression system may be contained in or on the same
transmission vector - such as in the same transforming bacterium or even in the same
plasmid. The advantage of this is that each expression system can then be delivered
at the same time. The respective expression systems will then be turned on during the
relevant life time of the grain or caryopsis or the plantlet or the mature plant.

The present invention therefore provides the novel and inventive use of a promoter
which can express a GOI in at least the aleurone cells of a developing caryopsis or
in at least the scutellar epithelial tissue or vascular tissue of a germinating seedling
or a developing grain or a plant (e.g. in the root, leaves and stem). In a preferred
embodiment the present invention relates to the use of a modified Ltp1 gene
promoter; preferably the Ltp1 gene promoter is obtainable from barley.

The main advantage of the present invention is that the use of the modified Ltp1 gene
promoter results in expression of a GOI in at least the aleurone layer of at least a
developing caryopsis, such as a developing barley caryopsis, or in at least the
scutellar epithelial tissue or vascular tissue of a germinating seedling or a developing
grain or plant of cereals such as rice, maize, wheat or other transgenic cereal grain
or caryopsis, preferably a developing rice grain.

Another advantage is that, depending on the type of GOI, the expressed products can
be stable in vivo. Hence over a period of time high levels of the expressed product
can accumulate in the aleurone cells or in epithelial cells or in the vascular tissue.

A further advantage is that the expression of the product coded for by a GOI in the
aleurone layer or the epithelial layer or the vascular tissue has minimal interference
with the developing embryo and seedling. This is in direct contrast to known
constitutive promoters which give high levels of expression in the developing seedling
and mature plant tissues which severely affect normal plant development. Thus the
present invention is particularly useful for expressing a GOI in at least the aleurone
layer of a developing caryopsis or in at least the scutellar epithelial tissue or
vascular tissue of a germinating seedling or a developing grain or plant - such as
cereal grains or caryopsis - and in doing so not detrimentally affecting the caryopsis,
seedling, grain or plant.

With regard to the first aspect of the present invention it is to be noted that this is the
first reported case for the specific expression of a GOI in the scutellar epithelial cells
or vascular cells of a transformed developing cereal grain such as rice.

With regard to some aspects of the present invention, it is to be noted that up until
now it was believed that the wild type Ltp1 gene promoter or a specific variant
thereof when fused to at least a segment of the Ltp1 functional gene would lead only
to expression in the aleurone layer. For example see the teachings of Skriver et al.
(1992). However, with the present invention, we have now surprisingly found that
this is not the case and it is now possible to modify the Ltp1 gene promoter to lead
to a pronounced expression in at least the aleurone layer or in at least the scutellar
epithelial layer or vascular tissue of a plant material.

In one embodiment the plant material is barley plant material. In another embodiment
the plant material is not barley plant material. In a preferred embodiment the plant 
material is rice plant material. In an alternative preferred embodiment the plant 
material is maize plant material. 

In a germinating, transgenic barley caryopsis according to the present invention, there
is expression in the aleurone layer.

In a germinating, transgenic rice seedling according to the present invention there is
pronounced expression in the scutellar epithelial tissue and vascular tissue.

As indicated, the expression pattern for the present invention is particularly surprising
as it was completely unexpected that a modified Ltp1 gene promoter could result in
expression of a GOI, such as a plant functional gene, in the aleurone cells of, for
example, barley or in the scutellar epithelial tissue or vascular tissue of a germinating
seedling or a developing grain or plant of rice (see experimental section later). The
findings of the present invention are also surprisingly different to the work of Skriver
et al. (1992) who, as mentioned above, report that the Ltp1 gene promoter and a
shortened version thereof when fused to the functional Ltp1 gene only
result in aleurone specific expression in barley - i.e. expression is not observed in any
other tissue in barley or even other cereals.

In order to prepare the transgenic organism according to the present invention, the
modified Ltp1 gene promoter may be initially inserted into a plasmid. For example,
the SacI-BclI Ltp1 gene promoter fragment can be inserted into the SacI-BamHI site
of Bluescript. A GOI, such as GUS, can then be inserted into this construct.
Furthermore, a Sh1 intron can then be inserted into the SmaI site of this construct.
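The cloning step above relies on the fragment ends matching the vector ends. The following Python sketch is an illustration only: it assumes the promoter fragment ends are SacI and BclI and that the vector is opened with SacI and BamHI, and it uses the standard recognition sites and cut positions of those three enzymes. The function names are hypothetical. It shows that BclI and BamHI leave the same 5' GATC overhang, which is why a BclI end can be ligated into a BamHI-cut vector.

```python
# Recognition site and top-strand cut position (number of bases left of the cut).
ENZYMES = {
    "SacI": ("GAGCTC", 5),   # GAGCT^C
    "BclI": ("TGATCA", 1),   # T^GATCA
    "BamHI": ("GGATCC", 1),  # G^GATCC
}


def overhang(enzyme: str) -> tuple:
    """Return (overhang type, single-stranded overhang sequence).

    All sites here are palindromic, so the bottom-strand cut mirrors the
    top-strand cut about the centre of the site.
    """
    site, cut = ENZYMES[enzyme]
    mirror = len(site) - cut
    if cut == mirror:
        return ("blunt", "")
    kind = "5'" if cut < mirror else "3'"
    low, high = min(cut, mirror), max(cut, mirror)
    return (kind, site[low:high])


def compatible(enzyme_a: str, enzyme_b: str) -> bool:
    """True if the two enzymes leave identical cohesive ends.

    The overhangs of palindromic sites are themselves palindromic, so
    identical overhangs are also complementary and can anneal.
    """
    end_a, end_b = overhang(enzyme_a), overhang(enzyme_b)
    return end_a[0] != "blunt" and end_a == end_b


for name in ENZYMES:
    print(name, overhang(name))
print("BclI/BamHI compatible:", compatible("BclI", "BamHI"))
```

Note that a BclI end joined to a BamHI end gives a hybrid junction that neither enzyme recognises, so that end of the insert can no longer be released with either enzyme.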

Stable integration into protoplasts may be achieved by using the method of Shimamoto
et al. (1989). Another way is by bombardment of an embryonic suspension of cells (e.g.
rice, barley or maize cells). A further way is by bombardment of immature embryos
(e.g. rice, maize or barley embryos).

With regard to the present invention, it is shown by using particle bombardments that
the modified Ltp1 gene promoter, such as the 787 bp fragment of the attached
sequence, when fused to a β-glucuronidase (GUS) reporter gene, which serves as a
GOI for the purposes of this invention, acts as a promoter for expression of GUS in
a specific tissue type or specific tissue types. For example, GUS expression can be
achieved in the aleurone cells of developing cereal caryopsis or grain, in particular
developing barley caryopsis, or in the scutellar epithelial tissue or vascular tissue of
a germinating seedling or a developing grain or plant, in particular developing rice
grain or germinating seedlings.

In particular, in transgenic rice plants, the modified barley Ltp1 gene promoter directs
strong expression of the GUS-reporter gene in the scutellar epithelial layer and the
vascular tissue of the developing caryopsis. This expression can continue through
into the germinating grains. The surprising finding is that very pronounced
expression can be seen in the scutellar epithelial tissue or vascular tissue of a
developing rice grain or germinating rice seedlings. Other examples include
expression in the vascular bundles and tip of emerging shoots and roots, leaf veins
and vascular bundles of stems.

Generally therefore the present invention relates to a modified promoter for a Ltp1
gene encoding a 10 kDa nsLTP. In the present invention, a genomic clone was
isolated using the cDNA insert of a previously isolated cDNA clone and characterised
by DNA sequencing (see discussion later). The sequences of the cDNA and the isolated
genomic clone were found to be identical in the overlapping region. It was found that
the Ltp1 gene contains one intron (see discussion later).

By comparing the DNA sequence of the active promoter sequences, two putative
cis-acting elements with the potential of binding known transcriptional factors present
in cereals were detected. They include the binding sites for transcriptional factors of
the myb and myc class, namely TAACTG and CANNTG respectively. Our studies
showed that high levels of expression are achieved when the myb and myc sites are
left intact.

In the present invention, mature fertile rice plants were regenerated from transformed
cultured rice protoplasts. The developing caryopsis of these primary transformants
were analysed for the expression of GUS. It was found that the modified barley Ltp1
gene promoter confers some expression in the aleurone layer of the transgenic rice
plants. However, pronounced expression was observed in the scutellar epithelial
tissue or vascular tissue of germinating rice seedlings or developing transgenic rice
grain or transgenic rice plants. This is the first example of such patterns of
expression in transgenic rice plants.
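Returning to the myb and myc binding sites named above (TAACTG and CANNTG), such motifs can be located in a promoter sequence by a simple pattern search. The following Python sketch is an illustration under stated assumptions: the function names and the example sequences are hypothetical and are not taken from the sequence listings, and N is read as any of A, C, G or T. Both strands are searched for the myb motif; the myc pattern reads the same on both strands, so it is searched once.

```python
import re

MOTIFS = {"myb": "TAACTG", "myc": "CANNTG"}
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string (N stays N)."""
    return seq.translate(_COMPLEMENT)[::-1]


def motif_regex(motif: str):
    """Compile a motif into a regex; the lookahead reports overlapping sites."""
    return re.compile("(?=(" + motif.replace("N", "[ACGT]") + "))")


def find_sites(sequence: str) -> list:
    """Return (name, strand, 0-based start on the given strand, matched bases)."""
    seq = sequence.upper()
    hits = []
    for name, motif in MOTIFS.items():
        pattern = motif_regex(motif)
        for match in pattern.finditer(seq):
            hits.append((name, "+", match.start(), match.group(1)))
        if revcomp(motif) == motif:
            # Palindromic pattern: minus-strand hits coincide with plus-strand hits.
            continue
        for match in pattern.finditer(revcomp(seq)):
            start = len(seq) - match.start() - len(motif)
            hits.append((name, "-", start, match.group(1)))
    return sorted(hits, key=lambda hit: (hit[2], hit[0], hit[1]))


# Hypothetical test sequence carrying one myb site and one myc site.
for hit in find_sites("GGTAACTGAACACGTGTT"):
    print(hit)
```

A scan of this kind only reports candidate sites; whether a given site is functional is a matter for deletion studies such as those described in this specification.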

The following sample has been deposited in accordance with the Budapest Treaty at
the recognised depositary The National Collections of Industrial and Marine Bacteria
Limited (NCIMB) at 23 St Machar Drive, Aberdeen, Scotland, AB2 1RY, United
Kingdom, on 11 January 1994:

An E. coli K12 bacterial stock containing the plasmid pLtp1.787-GN - i.e.
Bluescript containing a 787 bp fragment of the barley Ltp1 gene promoter
(Deposit Number NCIMB 40609).

The plasmid pLtp1.787-GN is shown pictorially in Figure 6 (see later).

The modified Ltp1 gene promoter can be isolated from this plasmid through the use
of appropriate PCR primers, which may be easily constructed from the data from the
shown sequences.
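As an illustration of how primers might be derived from a known sequence, the following Python sketch takes the two ends of a template: the forward primer is the first bases of the top strand and the reverse primer is the reverse complement of the last bases. The primer length, the function names and the template are assumptions made for the example and are not taken from the sequence listings. The melting temperature uses the Wallace rule (2 °C per A or T, 4 °C per G or C), which is only a rough guide for short oligonucleotides.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def end_primers(template: str, length: int = 20) -> tuple:
    """Return (forward, reverse) primers, both written 5' to 3'.

    The forward primer copies the 5' end of the top strand; the reverse
    primer is the reverse complement of the 3' end of the top strand.
    """
    template = template.upper()
    if len(template) < 2 * length:
        raise ValueError("template too short for two non-overlapping primers")
    forward = template[:length]
    reverse = revcomp(template[-length:])
    return forward, reverse


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


# Hypothetical 24-base template and a deliberately short primer length.
fwd, rev = end_primers("AAAACCCCGGGGTTTTACGTACGT", length=6)
print(fwd, wallace_tm(fwd))
print(rev, wallace_tm(rev))
```

In practice primers are usually longer and are also checked for self-complementarity and for matched melting temperatures; those checks are outside the scope of this sketch.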

Other embodiments and aspects of the present invention include:

A transformed host having the capability of expressing a GOI in the 
aleurone layer or the scutellar epithelial layer or the vascular tissue through 
the use of the gene promoter as hereinbefore described; 

A vector incorporating a construct as hereinbefore described or any part 
thereof;

A plasmid comprising a construct as hereinbefore described or any part
thereof; 

A cellular organism or cell line transformed with such a vector; 

A monocotyledonous plant comprising any one of the same;

A developing caryopsis or grain or germinating seedling comprising any of 
the same; and 



A method of expressing any one of the above. 



The present invention will now be described only by way of examples in which 
reference shall be made to the accompanying Figures in which: 

Figure 1 is a diagrammatic illustration of the structural components of a
developing caryopsis;

Figure 2 shows the results for an in situ hybridization experiment for a wild type
Ltp1 gene promoter in barley;

Figure 3 is a nucleotide sequence of part of the wild type Ltp1 gene taken from
Linnestad et al. (1991);

Figure 4 is a nucleotide sequence of part of the wild type Ltp1 gene taken from
Skriver et al. (1992);

Figure 5 is a nucleotide sequence of a 787 bp fragment of the wild type Ltp1
gene promoter;



Figure 6 is a linear map of the Ltp1.787-GN construct showing additional
sequence information;

Figure 7 is a circular map of the plasmid pLtp1.787-GN containing the Ltp1.787-GN
construct;

Figure 8 is a longitudinal section of a developing rice grain post expression of
the modified Ltp1 gene promoter; and

Figure 9 is a longitudinal section of a mature germinating rice grain post
expression of the modified Ltp1 gene promoter.

A. METHODS

i. Plant material 

Caryopsis of barley (Hordeum vulgare cv. Bomi) were collected from plants grown
in a phytotron as described before (Kvaale and Olsen, 1986). The plants were
emasculated and pollinated by hand and isolated in order to ensure accurate
determination of caryopsis age.

ii. cDNA and genomic clones 

The isolation and sequencing of the Ltp1 cDNA clone was conducted as described by
Jakobsen et al. (1989). A barley, cv. Bomi genomic library was constructed by
partial MboI digestion of total genomic DNA and subsequent ligation of the 10-20
kilobase pair (kb) size fraction with BamHI digested lambda EMBL3 DNA (Clontech
Labs, Palo Alto, CA, USA). Using the Ltp1 cDNA insert as a template for probe
synthesis with a random labelling kit (Boehringer-Mannheim), one positive clone was
identified after repeated rounds of plaque hybridization. DNA purified from this clone
was restricted with several enzymes and characterized by Southern blot analysis. The
sequence data obtained after this procedure are shown in Figure 3.

iii. In situ hybridization

For in vitro transcription of antisense RNA, the plasmid Ltp1 was linearized and
transcribed using MAXIscript (Ambion) and [α-^^P]-UTP (Amersham International).
The probe was hydrolysed to fragments of about 100 bp as described by Somssich et
al. (1988). Caryopsis tissues were fixed in 1% glutaraldehyde, 100 mM sodium
phosphate (pH 7.0) for 2 hours and embedded in Histowax (Histolab, Goteborg,
Sweden).

Barley caryopsis sections of 10 µm were pre-treated with pronase (Calbiochem) as
described by Schmelzer et al. (1988) and hybridized with 25 ml of hybridization mix
(200 ng probe ml-1, 50% formamide, 10% (w/v) dextran sulphate, 0.3 M NaCl, 10
mM Tris-HCl, 1 mM EDTA (pH 7), 0.02% polyvinyl-pyrrolidone, 0.2% Ficoll,
0.02% bovine serum albumin) for 15 hours at 50 °C.

Post-hybridization was carried out according to Somssich et al. (1988) and
autoradiography was done as described by Schmelzer et al. (1988).

iv. Constructs for transient expression analysis 

For the micro-projectile bombardment experiments, the following was used:
pLtp1.787-GN (see Figure 7 and associated commentary).
Isolated plasmid DNA was used in the bombardment studies.

For transient assay studies with rice protoplasts, the following were studied:
pLtp1.787-GN (see Figure 7 and associated commentary).

pLtp1.787(-myb/myc)-GN (see commentary below).

Deletion studies were performed on the modified Ltp1 gene promoter (Ltp1.787)
wherein a section of DNA containing the myb and myc sites (see Figure 3 and
associated commentary) was removed to form pLtp1.787(-myb/myc)-GN. In this
embodiment, the modified Ltp1 gene promoter having deletions from and between the
myb and myc sites was prepared and fused to GN. In order to prepare this deleted
modified Ltp1 gene promoter a PCR strategy using primers covering the flanking
sequences of the deleted sequence was adopted.
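The outcome of such a deletion can be pictured in silico. The following Python sketch is an illustration only: it assumes the myb and myc motifs TAACTG and CANNTG, removes everything from the start of the first site to the end of the last, and joins the two flanking sequences. The function name and the example sequence are hypothetical, and the exact end points of the deletion actually made are not specified in the text above.

```python
import re

MYB = re.compile("TAACTG")
MYC = re.compile("CA[ACGT]{2}TG")


def delete_myb_myc(promoter: str) -> tuple:
    """Remove the span covering the first myb site and the first myc site.

    Returns (promoter with the span deleted, the deleted span).
    """
    seq = promoter.upper()
    myb = MYB.search(seq)
    myc = MYC.search(seq)
    if myb is None or myc is None:
        raise ValueError("both a myb site and a myc site are required")
    start = min(myb.start(), myc.start())
    end = max(myb.end(), myc.end())
    # The upstream and downstream flanks are joined directly, as they would
    # be in a product built from primers spanning the deletion junction.
    return seq[:start] + seq[end:], seq[start:end]


# Hypothetical promoter stretch: flank, myb site, spacer, myc site, flank.
deleted, removed = delete_myb_myc("GGCCTAACTGAAACACGTGTTCC")
print(deleted)
print(removed)
```

The two flanks returned here correspond to the regions from which the flanking primers of the PCR strategy would be drawn.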

v. Transformation of barley cells by particle bombardment

Barley caryopsis were harvested at 25 DAP (days after pollination), surface sterilized
in 1% sodium hypochlorite for 5 min and then washed 4 times in sterile distilled
water. The maternal tissues were removed to expose the aleurone layer and the
caryopsis was then divided into two, longitudinally along the crease. The pieces of
tissue were then placed, endosperm down, onto MS media (Murashige & Skoog 1962)
with 10 g/l sucrose solidified with 10 g/l agar in plastic petri dishes (in two rows of
4 endosperm halves per dish). Embryos from the same caryopsis were placed in the
same petri dishes with the scutellum side facing upwards.

Single bombardments were performed in a DuPont PDS 1000 device, with M-17
tungsten pellets (approx. 1 µm in diameter) coated with DNA as described by
Gordon-Kamm et al. (1990) and using a 100 mm mesh 2 cm below the stopping
plate. Histochemical staining for GUS expression was performed with X-Gluc
(5-bromo-4-chloro-3-indolyl-β-D-glucuronic acid) as described by Jefferson (1987) at
37 °C for 2 days.

In these studies, after bombardment with pLtp1.787-GN and staining for GUS
activity, blue spots appeared both in the aleurone layer and in the scutellar
epithelium layer. These results demonstrate that the 787 bp fragment of the Ltp1
gene promoter of the present invention is capable of driving transcription in the
epithelial cells.

vi. Rice transformation

In these studies, the gene was transformed into rice by electroporation of
embryogenic protoplasts following the teachings of Shimamoto et al. (1989). Six
fertile transgenic rice plants were obtained. Histochemical GUS analysis was also
carried out with developing rice grains of 25 DAP, 1 to 5 day old seedlings and
up to 1 month old plants derived from transgenic grains. The results demonstrated
expression of the Ltp1-GUS gene in the scutellar epithelial layer of developing
transgenic rice plants. In addition, in a germinating rice seedling according to the
present invention there is a pronounced expression in the vascular tissue.

R RESULTS AND DISCUSSION WITH REFERENCE TO THE FIGURES

1. In order to explain more fully the results, reference is made to Figure 1 which
shows the major components of a typical developing caryopsis (or grain) 1. In this
regard, the caryopsis (or grain) 1 comprises an endosperm component 3 and an
embryo component 5. The endosperm component 3 is divisible into an outer aleurone
layer 7, which is three cells thick for barley caryopsis, and a starchy endosperm 9.
The embryo component 5 is divisible into a scutellum 11 and an embryo axis 13.
The scutellum 11 is further divisible into an epithelial layer 15 and parenchyma layer
17. Likewise, the embryo axis 13 is further divisible into a root component 19 and
a shoot component 21.
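
The hierarchy of components described for Figure 1 can be written down as a small nested structure. The following Python sketch is only an illustration: the dictionary layout and the helper name `path_to` are assumptions, while the component names and reference numerals are taken from the paragraph above.

```python
# Components of the developing caryopsis, keyed by name and reference numeral.
CARYOPSIS = {
    "caryopsis (1)": {
        "endosperm (3)": {
            "aleurone layer (7)": {},
            "starchy endosperm (9)": {},
        },
        "embryo (5)": {
            "scutellum (11)": {
                "epithelial layer (15)": {},
                "parenchyma layer (17)": {},
            },
            "embryo axis (13)": {
                "root component (19)": {},
                "shoot component (21)": {},
            },
        },
    }
}


def path_to(tree, target, trail=()):
    """Return the tuple of component names leading to target, or None."""
    for name, children in tree.items():
        here = trail + (name,)
        if name == target:
            return here
        found = path_to(children, target, here)
        if found is not None:
            return found
    return None


print(" > ".join(path_to(CARYOPSIS, "epithelial layer (15)")))
```

Such a structure makes it easy to see, for example, that the scutellar epithelial layer belongs to the embryo and not to the endosperm.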

2. Figure 2 is a transverse section of a 30 day-old wild-type developing barley
caryopsis showing in situ hybridisation with a radio-labelled Ltp1 probe. The bound
probe is only seen in the aleurone layer. It is not seen in any other tissue type, in
particular the scutellar epithelial layer. This work confirms the work of Skriver et
al. (1992).

The bright spots are due to optical interference.

3. Figure 3 shows the nucleotide sequence and the deduced amino acid sequence of
Ltp1. The intron is indicated by lower case letters. The TGA stop codon is indicated
by an asterisk, the putative CAAT and TATA sequences are indicated by boxes. A
21 bp inverted repeat is indicated by arrows. Four 8 bp palindromic sequences are
overlined. The motif indicated by thick underlining resembles the CATGTAAA motif
present in the promoters of several genes expressed in aleurone cells (Klemsdal et al.
(1991)). An AT block followed by a myb consensus recognition site and a myc
binding motif are indicated by double underlining.
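
Palindromic sequences of the kind mentioned for Figure 3 can be located with a simple scan. The sketch below is a minimal illustration and not the method used to annotate the figure: it treats a palindrome as a window equal to its own reverse complement (an assumption about the definition intended), and the function name `find_palindromes` and the toy input are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement; characters other than A, C, G, T become N."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp.get(base, "N") for base in reversed(seq.upper()))


def find_palindromes(seq: str, k: int = 8):
    """Return (0-based offset, window) for every k-mer that equals its own
    reverse complement."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window == reverse_complement(window):
            hits.append((i, window))
    return hits


# Toy sequence, not taken from the Ltp1 promoter.
print(find_palindromes("ACGAATTCGT", k=8))
```

Windows containing unreadable characters never match, because such characters are complemented to N.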

4. Figure 4 shows the sequence of the Ltp1 gene. The 351 bp open reading frame
is interrupted by a 133 bp intron (+412 to +544). The transcript start site is at
position +1. The putative CAAT and TATA boxes are at -107 and -34. A putative
poly (dA) site is at +785 (Skriver et al. (1992)).
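
The coordinates quoted for Figure 4 can be checked with a few lines of arithmetic. In this Python sketch the function name `inclusive_length` is hypothetical, and two assumptions are made: the intron coordinates are inclusive, and the 351 bp figure counts exon sequence only.

```python
def inclusive_length(first: int, last: int) -> int:
    """Length of a span whose first and last positions are both included."""
    if last < first:
        raise ValueError("last position precedes first position")
    return last - first + 1


INTRON = (412, 544)      # intron coordinates relative to the transcript start
ORF_LENGTH_BP = 351      # open reading frame length quoted in the text

intron_bp = inclusive_length(*INTRON)
codons, remainder = divmod(ORF_LENGTH_BP, 3)
genomic_span_bp = ORF_LENGTH_BP + intron_bp  # start codon to stop codon, intron included

print(intron_bp, codons, remainder, genomic_span_bp)
```

The intron length computed from the coordinates agrees with the 133 bp stated in the text, and the open reading frame length is an exact multiple of three.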

5. Figure 5 is the nucleotide sequence of the preferred embodiment of the present
invention, i.e. a 787 bp fragment of the Ltp1 gene promoter. The same commentary
for Figure 3 is equally applicable here.

6. Figure 6 is an outline of the Ltp1 genomic clone containing the Ltp1 structural
gene (shaded box) and the promoter fragment fused to the GUS gene (black box) used
to transform rice. Also indicated are the extensions of the Ltp1 fragment described
in Linnestad et al. (1991) and Skriver et al. (1992). The figures used represent DNA
fragment lengths in kb. The total length of the genomic clone is in the order of 8.1
kb.

7. Figure 7 helps explain how pLtp1.787-GN was constructed. In this regard, the
following fragments were sequentially cloned into the vector Bluescript KS': firstly
the 787 bp SacI/BclI fragment of the Ltp1 gene promoter was cloned into the
SacI/BclI site of the vector; and secondly a GUS-Nos terminator on a 2150 bp
SmaI/EcoRI fragment derived from pBI101 was cloned into SmaI/EcoRI downstream
of the Ltp1 promoter.

8. Figure 8 is a longitudinal section of a 30 day old transgenic rice grain showing
transcriptional activity of the construct of Figure 7 (i.e. pLtp1.787-GN) containing
the promoter of Figure 5. It is to be noted that transcriptional activity is achieved in
the scutellar epithelial layer, as denoted by the blue staining.

9. Figure 9 is a longitudinal section of a mature germinating transgenic rice grain
showing transcriptional activity of the construct of Figure 7 (i.e. pLtp1.787-GN)
containing the promoter of Figure 5.

It is to be noted that transcriptional activity is achieved in the scutellar epithelial
layer. Transcriptional activity is also observed in the shoot epithelial layer and in the
aleurone layer. However, the extent of expression in the last two tissue types is not
as pronounced as that in the scutellar epithelial layer.

However, more importantly, with the transgenic rice transcriptional activity is
observed in the vascular tissue of the germinating seedling and the vascular tissue of
the root and stem.

SUMMATION 

The Examples relate to the isolation of and to the use of a 787 bp fragment of the
promoter for the barley Ltp1 gene, which encodes a 10 kDa nsLTP. The gene was
isolated by the use of a cDNA from a differential screening experiment in which the
positive probe was constructed from aleurone cell poly (A) rich RNA, and the
negative probe from the starchy endosperm of immature grains.

A construct comprising the Ltp1 gene promoter fragment and a GOI (in this case
GUS) was stably inserted into rice protoplasts.

Expression and in situ analysis for the wild type gene promoter demonstrated that the
Ltp1 transcript is expressed in high levels only in the aleurone cells in developing
barley caryopsis. This expression continued in germinating grains and also in
plantlets and mature plants.

However, for transgenic cereals, especially rice, even though there is some expression
in the aleurone layer for the modified Ltp1 gene promoter, it is not as
pronounced as that in each of the epithelial cells of the scutellum, the epidermal cells
of the coleoptile and the vascular strands of the embryo of developing caryopsis (or
grain).

This result is completely unexpected as it shows that a modified Ltp1 promoter can
function differently in transgenic cereals, especially rice, than the wild-type Ltp1 gene
in barley.

Expression and histochemical analysis for the transgenic rice demonstrated that the
Ltp1 transcript is expressed in high levels in the scutellar epithelial tissue and vascular
tissue, especially of a germinating rice seedling and a developing rice grain and a rice
plant (e.g. in the root, leaves and stem). This expression continued in germinating
grains and also in plantlets and mature plants.

Importantly, for rice, expression is observed in the vascular tissue of the germinating
seedling and the vascular tissue of the root and stem.

This result is completely unexpected in view of the expression pattern of wild-type
Ltp1 gene in barley.

Using the 787 bp promoter fragment in particle bombardments of developing barley
caryopsis, we obtained activity (blue spots) in the epithelium layer of the scutellum.

The results therefore indicate that the modified Ltp1 gene promoter directs expression
of a GOI predominantly in the aleurone cells of developing caryopsis, particularly for
barley, or the scutellar epithelial tissue or vascular tissue of a germinating seedling
or a developing grain or a plant (e.g. in the root, leaves and stem) particularly for
rice.

The modified Ltp1 gene promoter therefore represents a valuable tool for the
expression of GOIs in the aleurone layer of developing caryopsis, in particular
developing barley caryopsis.

Moreover, the modified Ltp1 gene promoter represents a valuable tool for the
expression of GOIs in the scutellar epithelial cells and vascular cells of germinating
seedlings or developing grain, in particular developing or germinating rice seedlings
or grain. The epithelial or vascular expression is of particular benefit because the
787 bp Ltp1 gene fragment can be used to express antisense α-amylase in the
scutellar epithelial layer in order to reduce or to prevent damage due to preharvest
sprouting or to introduce or enhance pathogen resistance.

One possible reason for the expression activity of the modified Ltp1 gene promoter
of the present invention may be the absence of "silencer" elements in the modified
gene promoter which prevent expression of the wild type gene in, for example, the
scutellar epithelial layer and vascular cells. Accordingly, the term "modified" (as
defined above) could include removal of any silencer elements from the wild type
Ltp1 gene promoter.

Studies with the modified Ltp1 gene promoter having deletions from and between the
myb and myc sites when fused to GN showed that the relative activity of the deleted
modified Ltp1 gene promoter was less (in some cases 70% less) than the modified
Ltp1 gene promoter which contains the myb and myc sites. Therefore, it is believed
that the presence of the myb and myc sites is important for even higher levels of
expression of the modified Ltp1 promoter in at least protoplasts of at least rice.
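
The comparison between the deleted and the intact promoter amounts to a percent reduction in reporter activity. The following Python sketch shows the calculation under stated assumptions: the function name `percent_reduction` is hypothetical, and the two activity readings are invented illustrative numbers, not data from these studies.

```python
def percent_reduction(deleted_activity: float, intact_activity: float) -> float:
    """Percent by which the deleted promoter's reporter activity falls
    below that of the intact promoter."""
    if intact_activity <= 0:
        raise ValueError("intact promoter activity must be positive")
    return 100 * (intact_activity - deleted_activity) / intact_activity


# Hypothetical reporter readings in arbitrary units (not measured values).
intact = 100.0
deleted = 30.0
print(percent_reduction(deleted, intact))
```

With these invented readings the deleted construct would show a reduction of the same size as the largest one mentioned in the text.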

Accordingly the present invention also covers a method of enhancing the in vivo
expression of a GOI in at least the aleurone layer of a developing caryopsis or in at
least the scutellar epithelial tissue or vascular tissue of a germinating seedling or a
developing grain or a plant, preferably of an embryo of a developing monocotyledon
grain or caryopsis, comprising stably inserting into the genome of those cells a DNA
construct comprising a modified Ltp1 gene promoter and a GOI, wherein in the
formation of the construct the modified Ltp1 gene promoter is ligated to the GOI in
such a manner that each of the myb site and the myc site in the modified Ltp1 gene
promoter is maintained substantially intact.

The present invention also covers the use of a myb site and a myc site in a modified
Ltp1 gene promoter to enhance in vivo expression of a GOI in at least the aleurone
layer of a developing caryopsis or in at least the scutellar epithelial tissue or vascular
tissue of a germinating seedling or a developing grain or a plant, preferably of an
embryo of a developing monocotyledon caryopsis or grain, wherein the modified Ltp1
gene promoter and the GOI are integrated into the genome of the monocotyledon.

Each of these aspects is applicable to the combination expression system. 

CONCLUSIONS VIS-A-VIS THE SPECIFIC EXAMPLES 

1. The barley Ltp1 gene encodes a protein homologous to the 10 kDa wheat lipid
transfer protein.

2. The wild type Ltp1 gene promoter is expressed in developing barley aleurone
cells.

3. The modified Ltp1 gene promoter is transiently expressed in developing barley
scutellar epithelial cells after particle bombardment.

4. The modified Ltp1 gene promoter directs expression of the GUS-reporter gene
in the scutellar epithelial cells of developing transgenic rice grains. However,
more pronounced expression is observed in the vascular tissue of germinating
seedlings and the root and stem of the transgenic rice plant.

5. The modified Ltp1 gene promoter contains sequence elements implicated in the
transcriptional control of cereal endosperm specific genes.

6. The modified Ltp1 gene promoter contains myb and myc sequence elements that
are implicated in the level of transcription in cereal endosperm.

Other modifications of the present invention will be apparent to those skilled in the
art without departing from the scope of the invention.

(1) GENERAL INFORMATION

NAME OF APPLICANTS: O.-A. OLSEN AND R. KALLA

BUSINESS ADDRESS:   PLANT MOLECULAR BIOLOGY LABORATORY
                    DEPARTMENT OF BIOTECHNICAL SCIENCES
                    AGRICULTURAL UNIVERSITY OF NORWAY
                    AND AGRICULTURAL BIOTECHNOLOGY PROGRAM NRG
                    NORWAY
                    N-1432

TITLE OF INVENTION: PROMOTER

(2) INFORMATION FOR SEQUENCE I.D. 1

SEQUENCE TYPE:    NUCLEIC ACID
MOLECULE TYPE:    DNA (GENOMIC)
ORIGINAL SOURCE:  BARLEY
SEQUENCE LENGTH:  787
STRANDEDNESS:     DOUBLE
TOPOLOGY:         LINEAR
SEQUENCE:

-787 GAGCTCC    AAGGCATCAC CAAGCTTCTA TGACGCCAAA
-750 ACATCCAAGA AAGATATGTA CTAGGATACC AAGCACCCAA
     GAGTAAACGG AGGAAGTATA ATATAAGGCC CTGTTTGATA
-670 ACAAAGTAGT AAAAAAACTA AAGTATTAAA AACTGCAGTA
     ATTTTACGTG TAGATAGAAA ATACCATGGT TTTAATATAA
-590 TAATATTTTT TGCAGTATTC ACAATGTAGA GAAACTGTTT
     GATTACGCCA CATATTACTG CAGTTTAGAT CGAGCAAGTA
-510 CACGGGAAGA AGATAACGAC GTCCCACCCC TTCTTTTCGC
     CTTCTCTGTT TTTTAAAAAG AGGTCTGGGG TTAGTTTTTT
-430 CAATACTGCA GTTTTAAAAT CACAATTCTT AGAGGCAACC
     AAACACCTCA TTGTAAATAA AACTATGATA ATCTCCAAAA
-350 CTGCAGTATT CTAAAAATAC TACAAAAATT CTTTGTTATC
     AAACAGGGCC TAAGGAGTTA AAAAAATTTA GCCGTAACTG
-270 AGACTCGGCG AGGCACCAGC AGCTAGCAGT CATCAACACT
     TGATGGTTGG CAAAGCCGAG TCGACGTGTC GCGGGGCTCG
-190 GCCTGAGCGG GAGATACAAT CTGTTCTCCA GTAACCCCGT
     CGATTTGGCC CGCCGACTAA AGCATCCAGG CATCTCTCGC
-110 TCGAACCCCT ATTTAAGCCC CTCCATTCCT CCCAACATTC
     TCCACACCTC CACGAGTTGC TCATCACTAG CTAGTACGTT
 -30 GTACTGTTAG CTACAGATTA AGAAGTGATC

NOTE - ABOVE SEQUENCE IS A RETYPED VERSION OF FIGURE 5 WHICH IS TO BE
TAKEN AS THE CORRECT SEQUENCE
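
The position labels of the listing for SEQUENCE I.D. 1 can be checked against the stated sequence length. The Python sketch below is a consistency check only: the helper name `row_lengths` is hypothetical, and it assumes that the last nucleotide of the listing is position -1, so that the final labelled row ends just before position 0.

```python
# Position labels printed at the left of the listing for SEQUENCE I.D. 1.
LABELS = [-787, -750, -670, -590, -510, -430, -350, -270, -190, -110, -30]


def row_lengths(labels, end=0):
    """Number of nucleotides covered by each labelled row, assuming the
    sequence runs up to (but not including) position `end`."""
    bounds = list(labels) + [end]
    return [nxt - cur for cur, nxt in zip(bounds, bounds[1:])]


lengths = row_lengths(LABELS)
first_row = ["GAGCTCC", "AAGGCATCAC", "CAAGCTTCTA", "TGACGCCAAA"]

print(lengths, sum(lengths))
print(sum(len(block) for block in first_row))
```

Under that assumption the row lengths add up to the stated sequence length, and the first labelled row (one block of 7 and three blocks of 10) has the length implied by the first two labels.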
(3) INFORMATION FOR SEQUENCE I.D. 2



SEQUENCE TYPE:    NUCLEIC ACID
MOLECULE TYPE:    DNA (GENOMIC)
ORIGINAL SOURCE:  BARLEY
SEQUENCE LENGTH:  807
STRANDEDNESS:     DOUBLE
TOPOLOGY:         LINEAR
SEQUENCE:

-807            GATCTCG    ATGTGTAGTC TACGASA/V3G
-780 GTTAACCGTC TCTTCGTGAG AATAACCGTG GCCTAAAAAT
     AAGCCGATGA GGATAAATAA AATGTGGTGG TAC/iGTACTT
     CAAGAGGTTT ACTCATCAAG AGGATGCTTT TCCGATGyifiC
-660 TCTAGTAGTA CATCGGACCT CACATACCTC CATTGTGGTG
     AAATATTTTG TGCTCATTTA GTGATGGGTA AAIIIIGIII
     ATGTCACTCT AGGTTTTGAC ATTTCAGTTT "reCCACTCTT
-540 AGGTTTTGAC AAATAATTTC CATTCCGCGG CAAA/\GCAAA
     ACAATTTTAT TTTACnTTA  CCACTCTTAG CTTTCACAAT
     GTATCACAAA TGCCACTCTA GAAATTCTGT TTATGCCACA
-420 GAATGTGAAA AAAAACACTC ACTTATTTGA fiGCCMGETG
     TTCATGGCAT GGAAATGTGA CATAAAGTAA CGTICGTCTA
     TAAGAAAAAA TTGTACTCCT CGTAACAAGA GACGGAAACA
-300 TCATGAGACA ATCGCGTTTG GAAGGCTTTG CATCACCTTT
     GGATGATGCG CATGAATGGA GTCGTCTGCT TCCTAGCCTT
     CGCCTACCGC CCACTGAGTC CGGGCGGCAA CTACCATCQG
-180 CGAACGACCC AGCTGACCTC TACCGACCGG ACmSAATCC
     GCTACCTTCG TCAGCGACGA TGGCCGCGTA CGCrn3QCGAC
     GTGCCCCCGC ATGCATGGCG GCACATGGCG
 -60 GTGCGTGGCT GGCTACAAAT ACGTACCCCG TCAGfTlGCCCT
     AGCTAGAAAC TTACACaGC
INDICATIONS RELATING TO A DEPOSITED MICROORGANISM

(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description
on page ^ jLM- Jines 

B. IDENTIFICATION OF DEPOSIT    Further deposits are identified on an additional sheet [ ]
Name of depositary institution

The National Collections of Industrial and Marine Bacteria Limited (NCIMB)
Address of depositary institution (including postal code and country)

23 St. Machar Drive
Aberdeen
Scotland
AB2 1RY
United Kingdom

Date of deposit

Accession Number






11. "3Af4. 







C. ADDITIONAL INDICATIONS (leave blank if not applicable)    This information is continued on an additional sheet [ ]

In respect of those designations in which a European patent is sought, and any
other designated state having equivalent legislation, a sample of the deposited
microorganism will be made available until the publication of the mention of the
grant of the European patent or until the date on which the application has been
refused or withdrawn or is deemed to be withdrawn, only by the issue of such a
sample to an expert nominated by the person requesting the sample. (Rule 28(4)
EPC).

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications e.g., "Accession
Number of Deposit")

For receiving Office use only

This sheet was received with the international application

Authorized officer

For International Bureau use only

[ ] This sheet was received by the International Bureau on:

Authorized officer

Form PCT/RO/134 (July 1992)
CLAIMS

1. A modified Ltp1 gene promoter which is integrated, preferably stably
integrated, within a plant material's genomic DNA and which is capable of inducing
expression of a GOI when fused to the gene promoter in at least the aleurone cells or
in at least the scutellar epithelial tissue or vascular tissue of a germinating seedling
or a developing grain or a plant (e.g. in the root, leaves and stem).

2. A modified Ltp1 gene promoter according to claim 1 wherein the plant
material is a developing caryopsis, a germinating seedling, a developing grain or a
plant and wherein the gene promoter is integrated, preferably stably integrated, in the
developing caryopsis's genomic DNA or the germinating seedling's genomic DNA
or the developing grain's genomic DNA or the plant's genomic DNA and which is
capable of inducing expression of a GOI when fused to the gene promoter in at least
the aleurone cells of the developing caryopsis or in at least the scutellar epithelial
tissue or vascular tissue of a germinating seedling or a developing grain or a plant
(e.g. in the root, leaves and stem).

3. A modified Ltp1 gene promoter according to claim 1 or claim 2 wherein the
promoter comprises the nucleic acid sequence shown as SEQ. I.D. 1, or a sequence
that has substantial homology with that of SEQ. I.D. 1, or a variant thereof.

4. An isolated Ltp1 gene promoter comprising the sequence shown as SEQ. I.D.
1, or a sequence that has substantial homology therewith, or a variant thereof.

5. A construct comprising

a GOI and

a modified Ltp1 gene promoter according to any one of claims 1 to 4;

wherein the construct is capable of being expressed in at least the aleurone
cells or in at least the scutellar epithelial tissue or vascular tissue of a plant material;
and

wherein if there is expression in just the aleurone layer of a developing barley
caryopsis then the fused promoter and GOI are not the 769 bp fragment of Skriver
et al. (1992).

6. A construct according to claim 5 wherein the construct is capable of being
expressed in at least the aleurone cells of a developing caryopsis or in at least the
scutellar epithelial tissue or vascular tissue of a germinating seedling or a developing
grain or a plant when the construct is integrated, preferably stably integrated, within
the caryopsis's or grain's or seedling's or plant's genomic DNA.

7. A construct according to claim 5 or claim 6 wherein the modified Ltp1 gene
promoter comprises the nucleic acid sequence shown as SEQ. I.D. 1, or a sequence
that has substantial homology with that of SEQ. I.D. 1, or a variant thereof.

8. The construct according to any one of claims 5 to 7 wherein the construct
further comprises at least one additional sequence to increase expression of the GOI.

9. An expression system for at least the aleurone cells or for at least the scutellar
epithelial tissue or vascular tissue of a plant material, the expression system
comprising

a GOI fused to a modified Ltp1 gene promoter

wherein the expression system is capable of being expressed in at least the
aleurone cells or in at least the scutellar epithelial tissue or vascular tissue of the plant
material; and

wherein if there is expression in just the aleurone layer of a developing barley
caryopsis then the fused promoter and GOI are not the 769 bp fragment of Skriver
et al. (1992).

10. An expression system according to claim 9 wherein the expression system is
for at least the aleurone cells of a developing caryopsis or for at least the scutellar
epithelial tissue or vascular tissue of a germinating seedling or developing grain or
plant (e.g. in the root, leaves and stem).

11. An expression system according to claim 9 or claim 10 wherein the
expression system is additionally capable of being expressed in the embryo cells of
the germinating grain or the plantlet.

12. An expression system according to any one of claims 9 to 11 wherein the
expression system is integrated, preferably stably integrated, within a developing
caryopsis's genomic DNA or a germinating seedling's genomic DNA or a developing
grain's genomic DNA or a plant's genomic DNA.

13. An expression system according to any one of claims 9 to 12 wherein the gene
promoter comprises the sequence shown as SEQ I.D. No. 1 or comprises a sequence
that has substantial homology therewith, or is a variant thereof.

14. An expression system for at least the aleurone cells of a developing caryopsis
or for at least the scutellar epithelial tissue or vascular tissue of a germinating
seedling or developing grain or plant (e.g. in the root, leaves and stem), the
expression system comprising

a gene promoter fused to a GOI

wherein the expression system is capable of being expressed in at least the
aleurone cells of the developing caryopsis or in at least the scutellar epithelial tissue
or vascular tissue of a germinating seedling or a developing grain or a plant (e.g. in
the root, leaves and stem); either

wherein if there is expression in just the aleurone layer of a developing barley
caryopsis then either the promoter is not the wild type Ltp1 promoter in its natural
environment and the GOI is not the Ltp1 functional gene in its natural environment; or

wherein if there is expression in just the aleurone layer of a developing
caryopsis then the fused promoter and GOI are not the 769 bp fragment of Skriver
et al. (1992).

15. An expression system according to any one of claims 9 to 14 comprising a
construct according to any one of claims 5 to 8.

16. A transgenic cereal comprising an expression system according to any one of
claims 9 to 15 or a construct according to any one of claims 5 to 8 wherein the
expression system or construct is integrated, preferably stably integrated, within the
cereal's genomic DNA.

17. The use of a gene promoter as defined in any one of the preceding claims to
induce expression of a GOI when fused to the gene promoter in at least the aleurone
cells or in at least the scutellar epithelial tissue or vascular tissue of a plant material.

18. The use according to claim 17 wherein the gene promoter is used to induce
expression of a GOI when fused to the gene promoter in at least the aleurone cells of
a developing caryopsis or in at least the scutellar epithelial tissue or vascular tissue
of a germinating seedling or a developing grain or a plant (e.g. in the root, leaves and
stem).

19. A process of expressing a GOI when fused to a gene promoter as defined in
any one of the preceding claims, wherein expression occurs in at least the aleurone
cells or in at least the scutellar epithelial tissue or vascular tissue of a plant material.

20. A process according to claim 19 wherein the gene promoter expresses the GOI
when fused to the gene promoter in at least the aleurone cells of a developing
caryopsis or in at least the scutellar epithelial tissue or vascular tissue of a
germinating seedling or a developing grain or a plant (e.g. in the root, leaves and
stem).

21. A process according to claim 19 or claim 20 wherein the promoter and GOI
are integrated, preferably stably integrated, within a cereal's genomic DNA.

22. A process of expressing in at least the scutellar epithelial tissue or vascular
tissue of a developing grain or a germinating seedling or a plant, preferably a
developing rice grain or a germinating rice seedling or a transgenic rice plant, an
expression system according to any one of claims 9 to 15 or a construct according to
any one of claims 5 to 8 wherein the expression system or construct is integrated,
preferably stably integrated, within the cereal's genomic DNA.

23. The invention of any one of claims 1 to 22 wherein the gene promoter is a
fragment of a barley Ltp1 gene promoter.

24. The invention of claim 23 wherein the promoter is for a 10 kDa lipid transfer
protein.

25. The invention of claim 23 or claim 24 wherein the gene promoter is
obtainable from plasmid NCIMB 40609.

26. The invention of any one of claims 1 to 15 wherein the gene promoter is used
for expression of a GOI in a cereal caryopsis or a cereal grain or a cereal seedling
or a cereal plant.

27. The invention of claim 26 wherein the cereal caryopsis is a developing cereal
caryopsis, the cereal grain is a developing cereal grain, and the cereal seedling is a
germinating cereal seedling.

28. The invention of claim 26 or claim 27 wherein the cereal is any one of a rice,
maize, wheat, or barley.

29. The invention of claim 28 wherein the cereal is rice or maize, preferably rice.

30. The invention according to any one of claims 1 to 29 wherein the developing
caryopsis is a developing barley caryopsis, the germinating seedling is a germinating
rice seedling, the developing grain is a developing rice grain, and the plant is a
transgenic rice plant.

31. A combination expression system comprising

a. as a first construct, a construct according to any one of claims 5 to 8; and

b. as a second construct, a construct comprising a GOI and another gene
promoter that is tissue- or stage-specific.

32. A combination expression system according to claim 31 wherein each
construct is integrated, preferably stably integrated, within a plant material.

33. A combination expression system according to claim 32 wherein each
construct is integrated, preferably stably integrated, within a developing caryopsis's
genomic DNA or a grain's genomic DNA or a seedling's genomic DNA or a plant's
genomic DNA.

34. A combination expression system according to any one of claims 31 to 33
wherein the first construct comprises a modified Ltp1 gene promoter comprising the
nucleic acid sequence shown as SEQ. I.D. 1, or a sequence that has substantial
homology with that of SEQ. I.D. 1, or a variant thereof.

35. A combination expression system according to any one of claims 31 to 34
wherein the promoter in the second construct is an aleurone specific promoter.

36. A combination expression system according to any one of claims 31 to 35
wherein the promoter in the second construct is a barley promoter.

37. A combination expression system according to any one of claims 31 to 35
wherein the second construct is the B22E gene promoter.

38. A combination expression system according to any one of claims 31 to 37
wherein the promoter in the second construct is the Ltp2 gene promoter.

39. A combination expression system according to claim 38 wherein the promoter
in the second construct is for a 7 kDa lipid transfer protein.

40. A combination expression system according to claim 38 or 39 wherein the
promoter in the second construct is the promoter for Ltp2 of Hordeum vulgare.

41. A combination expression system according to any one of claims 31 to 40
wherein the promoter in the second construct comprises the sequence shown as SEQ.
I.D. 2, or a sequence that has substantial homology therewith, or a variant thereof.

42. A combination expression system according to any one of claims 38 to 41
wherein each of the myb site and the myc site in the Ltp2 gene promoter is
maintained substantially intact.

43. A combination expression system according to any one of claims 31 to 42
wherein the second construct further comprises at least one additional sequence to
increase expression of the GOI.

44. A developing cereal grain, preferably a germinating rice seedling, comprising
any one of: a promoter according to any one of claims 1 to 4 or any claim dependent
thereon, an expression system according to any one of claims 9 to 15 or any claim
dependent thereon, a construct according to any one of claims 5 to 8 or any claim
dependent thereon, or a combination expression system according to any one of
claims 31 to 43 or any claim dependent thereon.

45. The invention of any one of the preceding claims wherein each of the myb site
and the myc site in the gene promoter is maintained substantially intact.

46. Plasmid NCIMB 40609.

47. A promoter, a construct or an expression system or a combination expression
system substantially as described herein with reference to any one of Figures 5 to 9.

   1 GAGCTCCAAGGCATCACCAAGCTTCTATGACGCCAAAACATCCAAGAAAGATATGTACTAGGATACCAAGCACCC



  76 AAGAGTAAACGCAGGAAGTATAATATAAGGCCCTGTTTGATAACAAAGTAGTAAAAAAACTAAAGTATTAAAAAC
 151 TGCAGTAATTTTACGTGTAGATAGAAAATACCATGGTTTTAATATAATAATATTTTTTGCAGTATTCACAATGTA
 226 GAGAAACTGTTTGATTACGCCACATATTACTGCAGTTTAGATCGAGCAAGTACACGGGAAGAAGATAACGACGTC
 301 CCACCCCTTCTTTTCGCCTTCTCTGTTTTTTAAAAAGAGGTCTGGGGTTAGTTTTTTCAATACTGCAGTTTTAAA
 376 ATCACAATTCTTAGACGCAACCAAACACCTCATTGTAAATAAAACTATGATAATCTCCAAAACTGCAGTA
 451 AAAATACTACAAAAATTCTTTGTTATCAAACAGGGCCTAAGCAGTTAAAAAAATTTAGCCCTAACTGAGACTCGG
 526 CGAGGCACCAGCAGCTAGCAGTCATCAACACTTGATGGTTCCCAAACCK.CAGTCGACGTGTCGCGGGGCTCGGCC
 601 TCAGCGGGAGAT qCAATp TGTTCTCCAGTAACCCCGTCGATTTGGCCCGCCGACTAAAGCATCCAGtK^TCTeTe
 676 GCTCGAACCCCgAgr^CCCCTCCATTCCTCCCAACATTCTCCACACCTCCACCAGTTGCTCATCACTAGCTA
 751 GTACGTTGTACTGTTAGCTACAGATTAAGAACTCATC ATG GCC CGC GCT CAG GTA CTG CTC ATG
                                            M   A   R   A   Q   V   L   L   M

 815 GCC GCC GCC TTG GTG CTG ATG CTC ACC GCG GCC CCG CGC GCT GCC GTG GCC CTC AAC
      A   A   A   L   V   L   M   L   T   A   A   P   R   A   A   V   A   L   N

 872 TGC GGC CAG CTT GAC AGC AAG ATG AAA CCT TGC CTG ACC TAC GTT CAG GGC GGC CCC
      C   G   Q   V   D   S   K   M   K   P   C   L   T   Y   V   Q   G   G   P

 929 GGC CCC TCC GGC GAA TGC TGC AAC GGC GTC AGG GAT CTC CAT AAC CAG GCG CAA TCC
      G   P   S   G   E   C   C   N   G   V   R   D   L   H   N   Q   A   Q   S

 986 TCG GGC GAC CGC CAA ACC GTT TGC AAC TGC CTG AAG GGG ATC GCT CGC GGC ATC CAC
      S   G   D   R   Q   T   V   C   N   C   L   K   G   I   A   R   G   I   H

1043 AAT CTC AAC CTC AAC AAC GCC GCC AGC ATC CCC TCC AAG TGC AAT GTC AAC GTC CCA
      N   L   N   L   N   N   A   A   S   I   P   S   K   C   N   V   N   V   P

1100 TAC ACC ATC AGC CCC GAC ATC GAC TGC TCC AG gtgaccaaactcacacccacccagagtgaaat
      Y   T   I   S   P   D   I   D   C   S   R

1164 cttcaaaaagaactatacctacgaacggagcgagcatacaggaacattcacccacgcaaaactcgctgatactaa
1240 cattaacacgcacgattgacctgcag G ATT TAC TGAGCGACGATCCGTCAAGCTGGTGCTCAGCTCATCGA
                                   I   Y   *

1310 TCCACGTGGAGCTGAAGCGCGCAGCCTCTCTCCCTATGTAGTATGGCTACCAGTTATGCCGAGTTTATGCTGAAT
1385 AAGAACTCTCTCCTGTACTCCTTTGGAGGAGATCAGTATCTATGTACGTGAGAGTTGAGAGTTTGTACCATCGGC
1460 ACTCCCAGTGTTTATGGACTATATGCAT
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CTCCAOVACTCATCyiGCAtCyiCGGAATGCCATCyiCTTGAAATATAACTACATTCCTCAAA - 1 62 3 
CaU^CAAAAAGCACATT AGAATCTTCyiGCATTCACaiTAAjtyiCTTW - 1 5 6 3 

TATATATTTTGAGJUITCCTTTGGACGAGAAAAATCCATATTTACAATTCCTTGTAAATTT -1503 
GAGTCCATGATCCTAAAGACATTAAGCATCC6AATTACCCAAACATCAAAATTTCTGCCA - 1 4 4 3 
TTCAAACTAACAGTGTTAGAGAATCCTAATCCCCTACTTGACATACTTACTCTCTACGTC - 1 3 8 3 
GTGAAACCTAATAATGAGAGATCTAGCTCTAATACCAATTGAGAGGATGTGGATGTC6CC -1323 
TAGAGGGGCGGTGAATAGGCGCTTTAAAATAATXACGGTTTAGGCTCGAACAAATGT6GA - 1 2 € 3 
ATAAAACTAACGTTTCATTTGTCAACCGCAAAACCTAAAACAACXACGCTCACCTATCTC - 1 2 0 3 
CACCAACJU^CTTATGATAAGCAACUITAAAAAAACTAAGTGATGGCAGAATATATA^ - 1 1 4 3 

AAACAATATGGCTATCACAAAGTGAAGTGCATAAGTAAACA6CTCGGGTAAGGGACAACC - 1 0 8 3 
GAGCCATGCGGAGACGACCATGTATCCTCAACTTCACACACTTGCGGATCCTAATCTCCG -102: 
TTTGAAGCAGTGTGGAGCCACAATCGTCCCCAAGAACCCACTAAGGCCACC6TAATCTCC - 9 6: 
TCACGCCCTCGCACAATCGAAGATGTTGTGATTCCACTAAGGGACCCT7GAGGGCA6TCA - 90 : 
CTGAACCCGTATAAACATGGTTGGAACAATCTCCACGACTTAATTGGAGACTCCCAACAA -84: 
CACO^CGAACCTXaiTCATAACGAAATATGCCTTCGAGGTAACCTCAAATGCTCGCCGCA -7 8 ! 
ATTTTTACAACCTAATTGAAGACCTCGACGCTTGCGTGGAGCTTTACACTATAATGATTG -72: 
AGCTCCAAGGGCftrCACCAASCTTCTATGACGCaUUUlCATCauiGAAAGA» -66; 
GGATACCAAGCACCCAAGAGTAAACGGAGGAAGTATAATATAAGCCCCTGTTTGATAACA - 6 0 : 
AAGTAGTAAAAAAACTAAAGTATTAAAAACTGCAGTAATTTTACGTGTAGATAGAAAATA -54! 
CCATGGTTTTAATATAATAATATTTTTTGCACTATTCACAATGTAGAGAAACTGTTTGAT -48 
TACGCCACATATTACTGCAGTTTAGATCGAGCAAGTACACGGGAAGAAGATAACGACGTC - 4 2 
CCACCCCTTCTTTTCGCCTTCTCTGTTTTTTAAAAAGAGGTCTGGGGTTAGTTTTTTCAA - 3 6 
TACTGCAGTTTTAAAATCACJUlTTCTTAGAGGCAACCAAACACCr CATTGT AAATAAA^ - 3 0 
TATGATAATCTCCAAAACTGCAGTATTCTAAAAATACTACAAAAATTCTTTGTTATCAAA - 2 4 
CAGGGCCTAAGGAGTTAAAAAAATTTAGCCGTAACTGAGACTCGGCGAGGCACCAGCAGC -18: 
taccagtcatcaacac.xtgatggttggc:aaagccgagtcgacgtgtcgccgggctcggcc - 1 2 1 

tgagcgggagatacaatctgttctccactaaccccgtcgatttggcccgcccactaaagc - 6 1 

atccacgcatctctcgctcgaacccctatttaagcccctccarrcctcccaacattctcc - 1 

acaccrca^ccagttgctcatcactacctactxccttgtactgttagctacagxttaaga 6 0 

AGTGATCATGGCCCGCGCTCACCTACTCCTCATGGCCGCCGCCTTGGTCCTGATGCTCAC 120 
MARAQVLLMAAALVLMLT 

GGCGGCCCCGCGCGCTGCCGTGGCCCTCAACTGCCGCCAGGTTGACAGCAAGATGAAACC 180 
AAPRAAVALNCGQVDSKMKP 

TTGCCTGACCTACGTTCAGGGCGGCCCCGCCCCGTCCGGCGAATGCTGCAACGGCGTCAG 240 
CLTYVQGGPGPSGECCNGVR 

GGATCTCCATAACCAGGCGCAATCCTCGGGCGACCGCCAAACCGTTTGCAACTGCCTGAA 300 
DLHNQAQSSGDRQTVCNCLK 

GGGGATCGCTCGCGGCATCCACAATCTCAACCTCAACAACGCCGCCAGCATCCCCTCCAA 360 
GIARGIHNLNLNNAASIPSK 

GTGCAAXGTCAACGTCCCATACACCATCAGCCCCGACATCGACTGCTCCAG9^9att aaa 420 
CNVNVPYTISPDIDCSR 

1 1 tacact: cat eeagagt gaaat ct tt aaaaagaaetat at 1 1 acgaacggagt gagt a t 480 

ataggaacattcatccacgtaaaatttgttgatattaacattaacacgcatgattgacct 540 

gcagGATTTACTGAGCGACGATCCGTCAAGCTGGTGCTCAGCTCATCGATCCACCTGGAG 600 
    I  Y  * 

CTGAAGCGCGCAGCCTCTCTCCCTATGTAGTATGGCTACCAGTTATGCCCyiGTTTATGCT 6 60 

GAATAAGAACTCTCTCCTGTACTCCTTTCGACGAGATCAGTATCTATGTACGTGAGAGTT 720 

GAGAGTCTGTACCATCGGCACTCCCAGTCTTTATGGACTATATGCATACACCTCCTTCTG 780 

TGCTCAGTGTGTAACTTGTCTCTCTCTTTCCrCACGTTCGCGTCTCATATAATAATTTAC 840 

TTATCTCCTCTAGGATCGTAGTACAGTATCATATATAXACCTCXCTATGAATTAGTTTAC 900 

CGTACACCGTATGTTTCTTGAATCTCGATGAAAATTACGGATTCAAGCGTGCGTCCCCCA 9 60 

TATAATAAGCTTGCTTACGGATTCAAGCGTGCGTCACGCGGCTCAGTAGATGATGAGGAT 1020 

ACTCGCTGCTGCATCTCTACATCCCCCTCATCAGCTGAGCTCACCCCGGGTCCTCCCCCG 1080 

CTCCGGCCCGCTGGCCACCCCGGCCGGCCGACCCTCAAACAGCCTTCATGACGAGCCGCC 1140 

CGCCAGCAAGATCTGTTGGCTCCTCCCCTGTCCGTCGTAGAGAAACCCAGCA 1192 

1 GAGCTCCAAGGCATCACCAAGCTTCTATGACGCCAAAACATCCAAGAAAGATATGTACTAGGATACCAAGCACCC 
7 6 AAGAGTAAACGGAGGAAGTATAATATAAGGCCCTGTTTGATAACAJVAGTAGTAAAAAAACTAAAGTATTAAAAAC 
151 TGCAGTAATTTTACGTGTAGATAGAAAATACCATGGTTT^ 
226 cyVG AAACTGTTTGATTACGCCACATATTACTGCAGTTTAGATCGATO 

301 CC»CCCCTTCTTTTCGCCTTCTCTGTTTTTTAAAAAGAGGTCTGGGGTTAGTTTTTTCAATACTG^ 
376 ATCACAATTCTTAGAGGCAACaOU^a^CCTCATTGTA^^ 

4 51 AAAATACTACAAAAATTCTTTGTTATCAAACAGGGCCTAAGGAG'g^^^^^^GCCGT^TGAGACTC^ 
52 6 CGAG GCACCAGCA GCTAGCAGTCATCAACACTTGATGGTTGGCAAAGCCGAGTCGACGTGTCGCGGGGCTCGGCC 
601 TGAGCGGGAGATi> |cmt :TGTTCTCCAGTAACCCCGTCGATTTGGCCCGCCGACTAAAGCATCCAGGCATCTCTC 
67 6 GCTCGAACCCCgAT^^CCCCTCCATTCCTCCCAACATTCTCCACACCTCCACGAGTTGCTCATCACTAGCTA 
751 GTACGTTGTACTGTTAGCTACAGATTAAGAAGTGATC 

Field of the Invention 

This invention relates to modulating levels of enzymes and/or enzyme 
components capable of altering the production of long chain polyunsaturated fatty 
acids (PUFAs) in a host plant. The invention is exemplified by the production of 
PUFAs in plants. 

Three main families of polyunsaturated fatty acids (PUFAs) are the ω3 fatty 
acids, exemplified by eicosapentaenoic acid, the ω9 fatty acids, exemplified by Mead 
acid, and the ω6 fatty acids, exemplified by arachidonic acid. PUFAs are 
important components of the plasma membrane of the cell, where they may be 
found in such forms as phospholipids. PUFAs also serve as precursors to other 
molecules of importance in human beings and animals, including the prostacyclins, 
leukotrienes and prostaglandins. PUFAs are necessary for proper development, 
particularly in the developing infant brain, and for tissue formation and repair. 

Four major long chain PUFAs of importance include docosahexaenoic acid 
(DHA) and eicosapentaenoic acid (EPA), which are primarily found in different 
types of fish oil, gamma-linolenic acid (GLA), which is found in the seeds of a 
number of plants, including evening primrose (Oenothera biennis), borage (Borago 
officinalis) and black currants (Ribes nigrum), and stearidonic acid (SDA), which is 
found in marine oils and plant seeds. Both GLA and another important long chain 
PUFA, arachidonic acid (ARA), are found in filamentous fungi. ARA can be 
purified from animal tissues including liver and adrenal gland. Mead acid 
accumulates in essential fatty acid deficient animals. 

For DHA, a number of sources exist for commercial production including a 
variety of marine organisms, oils obtained from cold water marine fish, and egg 
yolk fractions. For ARA, microorganisms including the genera Mortierella, 
Entomophthora, Pythium and Porphyridium can be used for commercial 
production. Commercial sources of SDA include the genera Trichodesma and 
Echium. Commercial sources of GLA include evening primrose, black currants 
and borage. However, there are several disadvantages associated with commercial 
production of PUFAs from natural sources. Natural sources of PUFAs, such as 
animals and plants, tend to have highly heterogeneous oil compositions. The oils 
obtained from these sources therefore can require extensive purification to separate 
out one or more desired PUFAs or to produce an oil which is enriched in one or 
more PUFA. Natural sources also are subject to uncontrollable fluctuations in 
availability. Fish stocks may undergo natural variation or may be depleted by 
overfishing. Fish oils have unpleasant tastes and odors, which may be impossible to 
economically separate from the desired product, and can render such products 
unacceptable as food supplements. Animal oils, and particularly fish oils, can 
accumulate environmental pollutants. Weather and disease can cause fluctuation in 
yields from both fish and plant sources. Cropland available for production of 
alternate oil-producing crops is subject to competition from the steady expansion of 
human populations and the associated increased need for food production on the 
remaining arable land. Crops which do produce PUFAs, such as borage, have not 
been adapted to commercial growth and may not perform well in monoculture. 
Growth of such crops is thus not economically competitive where more profitable 
and better established crops can be grown. Large scale fermentation of organisms 
such as Mortierella is also expensive. Natural animal tissues contain low amounts 
of ARA and are difficult to process. Microorganisms such as Porphyridium and 
Mortierella are difficult to cultivate on a commercial scale. 

Dietary supplements and pharmaceutical formulations containing PUFAs 
can retain the disadvantages of the PUFA source. Supplements such as fish oil 
capsules can contain low levels of the particular desired component and thus 
require large dosages. High dosages result in ingestion of high levels of undesired 
components, including contaminants. Care must be taken in providing fatty acid 
supplements, as overaddition may result in suppression of endogenous biosynthetic 
pathways and lead to competition with other necessary fatty acids in various lipid 
fractions in vivo, leading to undesirable results. For example, Eskimos having a 
diet high in fatty acids have an increased tendency to bleed (U.S. Pat. No. 
4,874,603). Unpleasant tastes and odors of the supplements can make such 
regimens undesirable, and may inhibit compliance by the patient. 

A number of enzymes are involved in PUFA biosynthesis. Linoleic acid 
(LA, 18:2 Δ9,12) is produced from oleic acid (18:1 Δ9) by a Δ12-desaturase. GLA 
(18:3 Δ6,9,12) is produced from linoleic acid (LA, 18:2 Δ9,12) by a Δ6- 
desaturase. ARA (20:4 Δ5,8,11,14) production from DGLA (20:3 Δ8,11,14) is 
catalyzed by a Δ5-desaturase. However, animals cannot desaturate beyond the Δ9 
position and therefore cannot convert oleic acid (18:1 Δ9) into linoleic acid (18:2 
Δ9,12). Likewise, α-linolenic acid (ALA, 18:3 Δ9,12,15) cannot be synthesized 
by mammals. Other eukaryotes, including fungi and plants, have enzymes which 
desaturate at positions Δ12 and Δ15. The major poly-unsaturated fatty acids of 
animals therefore are either derived from diet and/or from desaturation and 
elongation of linoleic acid (18:2 Δ9,12) or α-linolenic acid (18:3 Δ9,12,15). 

Poly-unsaturated fatty acids are considered to be useful for nutritional, 
pharmaceutical, industrial, and other purposes. The supply of poly-unsaturated 
fatty acids from natural sources and from chemical synthesis is not 
sufficient for commercial needs. Therefore it is of interest to obtain genetic 
material involved in PUFA biosynthesis from species that naturally produce these 
fatty acids and to express the isolated material alone or in combination in a 
heterologous system which can be manipulated to allow production of commercial 
quantities of PUFAs. 
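
The shorthand used above (chain length, number of double bonds, and Δ positions counted from the carboxyl end) can be made concrete with a small sketch. The `FattyAcid` type and `desaturate` function below are hypothetical names introduced only for illustration; they model the bookkeeping of the notation, not the enzymology.

```python
from typing import FrozenSet, NamedTuple


class FattyAcid(NamedTuple):
    carbons: int                  # chain length, e.g. 18
    double_bonds: FrozenSet[int]  # Δ positions, counted from the carboxyl end

    def shorthand(self) -> str:
        """Return the 'carbons:bonds Δpositions' shorthand, e.g. '18:2 Δ9,12'."""
        base = f"{self.carbons}:{len(self.double_bonds)}"
        if not self.double_bonds:
            return base
        positions = ",".join(str(p) for p in sorted(self.double_bonds))
        return f"{base} Δ{positions}"


def desaturate(fa: FattyAcid, delta: int) -> FattyAcid:
    """Model a Δ-desaturase: add a double bond between carbons delta and delta+1."""
    if not 1 <= delta < fa.carbons:
        raise ValueError(f"Δ{delta} is outside a {fa.carbons}-carbon chain")
    if delta in fa.double_bonds:
        raise ValueError(f"Δ{delta} is already unsaturated")
    return FattyAcid(fa.carbons, fa.double_bonds | {delta})


oleic = FattyAcid(18, frozenset({9}))
linoleic = desaturate(oleic, 12)   # Δ12-desaturase: oleic acid -> LA
gla = desaturate(linoleic, 6)      # Δ6-desaturase: LA -> GLA
dgla = FattyAcid(20, frozenset({8, 11, 14}))
ara = desaturate(dgla, 5)          # Δ5-desaturase: DGLA -> ARA

for name, fa in [("LA", linoleic), ("GLA", gla), ("ARA", ara)]:
    print(name, fa.shorthand())
```

Each call adds exactly one Δ position, which mirrors the one-desaturase-per-step conversions listed in the paragraph above.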

SUMMARY OF THE INVENTION 

Novel compositions and methods are provided for preparation of poly- 
unsaturated long chain fatty acids and desaturases in plants and plant cells. The 
methods involve growing a host plant cell of interest transformed with an 
expression cassette functional in a host plant cell, the expression cassette 
comprising a transcriptional and translational initiation regulatory region, joined in 
reading frame 5' to a DNA sequence encoding a desaturase polypeptide capable of 
modulating the production of PUFAs. Expression of the desaturase polypeptide 
provides for an alteration in the PUFA profile of host plant cells as a result of 
altered concentrations of enzymes involved in PUFA biosynthesis. Of particular 
interest is the selective control of PUFA production in plant tissues and/or plant 
parts such as leaves, roots, fruits and seeds. The invention finds use for example in 
the large scale production of DHA, Mead acid, EPA, ARA, stearidonic acid and 
GLA and for modification of the fatty acid profile of edible plant tissues and/or 
plant parts. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 shows possible pathways for the synthesis of Mead acid (20:3 Δ5, 
8,11), arachidonic acid (20:4 Δ5,8,11,14) and stearidonic acid (18:4 Δ6,9,12, 
15) from palmitic acid (C16) from a variety of organisms, including algae, 
Mortierella and humans. These PUFAs can serve as precursors to other molecules 
important for humans and other animals, including prostacyclins, leukotrienes, and 
prostaglandins, some of which are shown. 

Figure 2 shows possible pathways for production of PUFAs in addition to 
ARA, including taxoleic acid and pinolenic, again compiled from a variety of 
organisms. 
BRIEF DESCRIPTION OF THE SEQUENCE LISTINGS 

SEQ ID NO. 1 shows DNA sequence from a Schizochytrium clone with 
homology to both Δ12 and Δ15 desaturases. 

SEQ ID NO. 2 shows peptide sequence from a Schizochytrium clone with 
homology to both Δ12 and Δ15 desaturases. 

The following definitions are provided: 

Δ5-Desaturase: Δ5-desaturase is an enzyme which introduces a double 
bond between carbons 5 and 6 from the carboxyl end of a fatty acid molecule. 

Δ6-Desaturase: Δ6-desaturase is an enzyme which introduces a double 
bond between carbons 6 and 7 from the carboxyl end of a fatty acid molecule. 

Δ9-Desaturase: Δ9-desaturase is an enzyme which introduces a double 
bond between carbons 9 and 10 from the carboxyl end of a fatty acid molecule. 

Δ12-Desaturase: Δ12-desaturase is an enzyme which introduces a double 
bond between carbons 12 and 13 from the carboxyl end of a fatty acid molecule. 

Fatty Acids: Fatty acids are a class of compounds containing a long 
hydrocarbon chain and a terminal carboxylate group. Fatty acids include the 
following: 

Fatty Acid 

12:0            Lauric acid 
16:0            Palmitic acid 
16:1            Palmitoleic acid 
18:0            Stearic acid 
18:1            Oleic acid                                  Δ9-18:1 
18:2            Taxoleic acid                               Δ5,9-18:2 
18:2 Δ6,9       6,9-Octadecadienoic acid                    Δ6,9-18:2 
18:2            Linoleic acid                               Δ9,12-18:2 (LA) 
18:3 Δ6,9,12    Gamma-linolenic acid                        Δ6,9,12-18:3 (GLA) 
18:3            Pinolenic acid                              Δ5,9,12-18:3 
18:3            Alpha-linolenic acid                        Δ9,12,15-18:3 (ALA) 
18:4            Stearidonic acid                            Δ6,9,12,15-18:4 (SDA) 
20:0            Arachidic acid 
20:1            Eicosenoic acid 
20:2 Δ8,11                                                  Δ8,11 
20:3 Δ5,8,11    Mead acid                                   Δ5,8,11 
22:0            Behenic acid 
22:1            Erucic acid 
22:2            Docosadienoic acid 
20:4 ω6         Arachidonic acid                            Δ5,8,11,14-20:4 (ARA) 
20:3 ω6         ω6-Eicosatrienoic, dihomo-gamma linolenic   Δ8,11,14-20:3 (DGLA) 
20:5 ω3         Eicosapentaenoic (timnodonic acid)          Δ5,8,11,14,17-20:5 (EPA) 
20:3 ω3         ω3-Eicosatrienoic                           Δ11,14,17-20:3 
20:4 ω3         ω3-Eicosatetraenoic                         Δ8,11,14,17-20:4 
22:5 ω3         Docosapentaenoic                            Δ7,10,13,16,19-22:5 (ω3DPA) 
22:6 ω3         Docosahexaenoic (cervonic acid)             Δ4,7,10,13,16,19-22:6 (DHA) 
24:0            Lignoceric acid 


Taking into account these definitions, the present invention is directed to 
novel DNA sequences, DNA constructs, methods and compositions 
which permit modification of the polyunsaturated long chain fatty acid content of 
plant cells. Plant cells are transformed with an expression cassette comprising a 
DNA encoding a polypeptide capable of increasing the amount of one or more 
PUFA in a plant cell. Desirably, integration constructs may be prepared which 
provide for integration of the expression cassette into the genome of a host cell. 
Host cells are manipulated to express a sense or antisense DNA encoding a 
polypeptide(s) that has desaturase activity. By "desaturase" is intended a 
polypeptide which can desaturate one or more fatty acids to produce a mono- or 
poly-unsaturated fatty acid or precursor thereof of interest. By "polypeptide" is 
meant any chain of amino acids, regardless of length or post-translational 
modification, for example, glycosylation or phosphorylation. The substrate(s) for 
the expressed enzyme may be produced by the host cell or may be exogenously 
supplied. 

To achieve expression in a host cell, the transformed DNA is operably 
associated with transcriptional and translational initiation and termination 
regulatory regions that are functional in the host cell. Constructs comprising the 
gene to be expressed can provide for integration into the genome of the host cell or 
can autonomously replicate in the host cell. For production of linoleic acid (LA), 
the expression cassettes generally used include a cassette which provides for Δ12 
desaturase activity, particularly in a host cell which produces or can take up oleic 
acid. For production of ALA, the expression cassettes generally used include a 
cassette which provides for Δ15 or ω3 desaturase activity, particularly in a host cell 
which produces or can take up LA. For production of GLA or SDA, the expression 
cassettes generally used include a cassette which provides for Δ6 desaturase 
activity, particularly in a host cell which produces or can take up LA or ALA, 
respectively. Production of ω6-type unsaturated fatty acids, such as LA or GLA, is 
favored in a plant capable of producing ALA by inhibiting the activity of a Δ15 or 
ω3 type desaturase; this is accomplished by providing an expression cassette for an 
antisense Δ15 or ω3 transcript, or by disrupting a Δ15 or ω3 desaturase gene. 
Similarly, production of LA or ALA is favored in a plant having Δ6 desaturase 
activity by providing an expression cassette for an antisense Δ6 transcript, or by 
disrupting a Δ6 desaturase gene. Production of oleic acid likewise is favored in a 
plant having Δ12 desaturase activity by providing an expression cassette for an 
antisense Δ12 transcript, or by disrupting a Δ12 desaturase gene. For production of 
ARA, the expression cassette generally used provides for Δ5 desaturase activity, 
particularly in a host cell which produces or can take up DGLA. Production of ω6- 
type unsaturated fatty acids, such as ARA, is favored in a plant capable of 
producing ALA by inhibiting the activity of a Δ15 or ω3 type desaturase; this is 
accomplished by providing an expression cassette for an antisense Δ15 or ω3 
transcript, or by disrupting a Δ15 or ω3 desaturase gene. 
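
The product-by-product rules in the preceding paragraph amount to a lookup from desired fatty acid to the desaturase activity the cassette should supply and the substrate the host must produce or take up. A minimal sketch of that lookup follows; `CASSETTES` and `choose_cassette` are hypothetical names used only for illustration, and the table simply restates the text.

```python
# desired product -> (desaturase activity supplied by the cassette,
#                     substrate the host cell must produce or take up)
CASSETTES = {
    "LA": ("Δ12", "oleic acid"),
    "ALA": ("Δ15 or ω3", "LA"),
    "GLA": ("Δ6", "LA"),
    "SDA": ("Δ6", "ALA"),
    "ARA": ("Δ5", "DGLA"),
}


def choose_cassette(product: str, host_substrates: set) -> str:
    """Return the desaturase activity to express for a desired product.

    Raises ValueError if the host neither produces nor takes up the
    substrate that the desaturase would act on.
    """
    activity, substrate = CASSETTES[product]
    if substrate not in host_substrates:
        raise ValueError(f"host lacks {substrate}, needed to make {product}")
    return activity


print(choose_cassette("GLA", {"oleic acid", "LA"}))
```

The antisense and gene-disruption options described in the text (for example suppressing Δ15 activity to favor ω6 products) are not modeled here.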

TRANSGENIC PLANT PRODUCTION OF FATTY ACIDS 

Transgenic plant production of PUFAs offers several advantages over 
purification from natural sources such as fish or plants. Production of fatty acids 
from recombinant plants provides the ability to alter the naturally occurring plant 
fatty acid profile by providing new synthetic pathways in the host or by suppressing 
undesired pathways, thereby increasing levels of desired PUFAs, or conjugated 
forms thereof, and decreasing levels of undesired PUFAs. Production of fatty acids 
in transgenic plants also offers the advantage that expression of desaturase genes in 
particular tissues and/or plant parts means that greatly increased levels of desired 
PUFAs in those tissues and/or parts can be achieved, making recovery from those 
tissues more economical. For example, the desired PUFAs can be expressed in 
seed; methods of isolating seed oils are well established. In addition to providing a 
source for purification of desired PUFAs, seed oil components can be manipulated 
through expression of desaturase genes, either alone or in combination with other 
genes such as elongases, to provide seed oils having a particular PUFA profile in 
concentrated form. The concentrated seed oils then can be added to animal milks 

a polypeptide having Δ5 desaturase activity. In particular instances, this can be 
coupled with an expression cassette which provides for production of a polypeptide 
having Δ6 desaturase activity and, optionally, a transcription cassette providing for 
production of antisense sequences to a Δ15 transcription product. The choice of 
combination of cassettes used depends in part on the PUFA profile of the host cell. 
Where the host cell Δ5-desaturase activity is limiting, overexpression of Δ5 
desaturase alone generally will be sufficient to provide for enhanced ARA 
production. 

SOURCES OF POLYPEPTIDES 
HAVING DESATURASE ACTIVITY 

Sources of polypeptides having desaturase activity and oligonucleotides 
encoding such polypeptides are organisms which produce a desired poly- 
unsaturated fatty acid. As an example, microorganisms having an ability to 
produce ARA can be used as a source of Δ5-desaturase genes; microorganisms 
which produce GLA or SDA can be used as a source of Δ6-desaturase and/or Δ12- 
desaturase genes. Such microorganisms include, for example, those belonging to 
the genera Mortierella, Conidiobolus, Pythium, Phytophthora, Penicillium, 
Porphyridium, Cladosporium, Mucor, Fusarium, Aspergillus, Rhodotorula, and 
Entomophthora. Within the genus Porphyridium, of particular interest is 
Porphyridium cruentum. Within the genus Mortierella, of particular interest are 
Mortierella elongata, Mortierella exigua, Mortierella hygrophila, Mortierella 
ramanniana var. angulispora, and Mortierella alpina. Within the genus Mucor, of 
particular interest are Mucor circinelloides and Mucor javanicus. 

DNAs encoding desired desaturases can be identified in a variety of ways. 
As an example, a source of the desired desaturase, for example genomic or cDNA 
libraries from Mortierella, is screened with detectable enzymatically- or 
chemically-synthesized probes, which can be made from DNA, RNA, or non- 
naturally occurring nucleotides, or mixtures thereof. Probes may be enzymatically 
synthesized from DNAs of known desaturases for normal or reduced-stringency 
hybridization methods. Oligonucleotide probes also can be used to screen sources 
and can be based on sequences of known desaturases, including sequences 
conserved among known desaturases, or on peptide sequences obtained from the 
desired purified protein. Oligonucleotide probes based on amino acid sequences 
can be degenerate to encompass the degeneracy of the genetic code, or can be 
biased in favor of the preferred codons of the source organism. Oligonucleotides 
also can be used as primers for PCR from reverse transcribed mRNA from a known 
or suspected source; the PCR product can be the full length cDNA or can be used to 
generate a probe to obtain the desired full length cDNA. Alternatively, a desired 
protein can be entirely sequenced and total synthesis of a DNA encoding that 
polypeptide performed. 
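
To illustrate what "degenerate to encompass the degeneracy of the genetic code" means in practice, the sketch below counts how many distinct oligonucleotides would be needed to cover every possible coding of a peptide, and picks the least degenerate stretch of a peptide for probe design. It assumes the standard genetic code; the function names and the example peptide are hypothetical and not taken from the source.

```python
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

CODONS_PER_AA = {}
for codon, aa in CODON_TABLE.items():
    CODONS_PER_AA.setdefault(aa, []).append(codon)


def degeneracy(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the peptide."""
    return prod(len(CODONS_PER_AA[aa]) for aa in peptide.upper())


def best_window(peptide: str, length: int):
    """Return (start, sub-peptide, degeneracy) for the least degenerate window."""
    if length < 1 or length > len(peptide):
        raise ValueError("window length must be between 1 and the peptide length")
    scored = [
        (degeneracy(peptide[i:i + length]), i)
        for i in range(len(peptide) - length + 1)
    ]
    fold, start = min(scored)
    return start, peptide[start:start + length], fold


# Hypothetical peptide fragment; Met and Trp have a single codon each,
# so windows containing them give the least degenerate probes.
print(best_window("LSRMWHEF", 5))
```

Biasing toward the preferred codons of the source organism, as the text also suggests, would reduce the pool further by dropping rarely used codons rather than enumerating all of them.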

Once the desired genomic or cDNA has been isolated, it can be sequenced 
by known methods. It is recognized in the art that such methods are subject to 
errors, such that multiple sequencing of the same region is routine and is still 
expected to lead to measurable rates of mistakes in the resulting deduced sequence, 
particularly in regions having repeated domains, extensive secondary structure, or 
unusual base compositions, such as regions with high GC base content. When 
discrepancies arise, resequencing can be done and can employ special methods. 
Special methods can include altering sequencing conditions by using: different 
temperatures; different enzymes; proteins which alter the ability of oligonucleotides 
to form higher order structures; altered nucleotides such as ITP or methylated 
dCTP; different gel compositions, for example adding formamide; different primers 
or primers located at different distances from the problem region; or different 
templates such as single stranded DNAs. Sequencing of mRNA can also be 
employed. 

For the most part, some or all of the coding sequence for the polypeptide 
having desaturase activity is from a natural source. In some situations, however, it 
is desirable to modify all or a portion of the codons, for example, to enhance 
expression, by employing host preferred codons. Host preferred codons can be 
determined from the codons of highest frequency in the proteins expressed in the 
largest amount in a particular host species of interest. Thus, the coding sequence 
for a polypeptide having desaturase activity can be synthesized in whole or in part. 
All or portions of the DNA also can be synthesized to remove any destabilizing 
sequences or regions of secondary structure which would be present in the 
transcribed mRNA. All or portions of the DNA also can be synthesized to alter the 
base composition to one more preferable in the desired host cell. Methods for 
synthesizing sequences and bringing sequences together are well established in the 
literature. In vitro mutagenesis and selection, site-directed mutagenesis, or other 
means can be employed to obtain mutations of naturally occurring desaturase genes 
to produce a polypeptide having desaturase activity in vivo with more desirable 
physical and kinetic parameters for function in the host cell, such as a longer half- 
life or a higher rate of production of a desired polyunsaturated fatty acid. 
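
The statement that host preferred codons "can be determined from the codons of highest frequency in the proteins expressed in the largest amount" can be sketched as a simple tally. The code below assumes the standard genetic code and in-frame coding sequences; `preferred_codons`, `recode` and the toy input sequence are hypothetical and serve only to show the counting step.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def preferred_codons(coding_sequences):
    """Map each amino acid to its most frequent codon in the given in-frame CDSs."""
    counts = defaultdict(Counter)
    for cds in coding_sequences:
        cds = cds.upper()
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            aa = CODON_TABLE.get(codon)   # None for codons with ambiguous bases
            if aa is not None and aa != "*":
                counts[aa][codon] += 1
    return {aa: tally.most_common(1)[0][0] for aa, tally in counts.items()}


def recode(protein: str, preferred: dict) -> str:
    """Back-translate a protein using one preferred codon per amino acid."""
    return "".join(preferred[aa] for aa in protein.upper())


# Toy stand-in for a set of highly expressed host genes.
highly_expressed = ["ATGGCCGCCGCTAAGAAGAAATAA"]
table = preferred_codons(highly_expressed)
print(recode("MAK", table))
```

A real design would draw the tally from many highly expressed genes of the host and would also screen the recoded sequence for destabilizing motifs and secondary structure, as the paragraph notes.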

Desirable cDNAs have less than 60% A+T composition, preferably less 
than 50% A+T composition. On a localized scale of a sliding window of 20 base 
pairs, it is preferable that there are no localized regions of the cDNA with greater 
than 75% A+T composition; with a window of 60 base pairs, it is preferable that 
there are no localized regions of cDNA with greater than 60%, more preferably 
no localized regions with greater than 55% A+T composition. 
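
The A+T criteria above translate directly into an overall fraction plus two sliding-window maxima. The following is a minimal sketch under the thresholds stated in the text; the function names and the test sequence are hypothetical.

```python
def at_fraction(seq: str) -> float:
    """Fraction of A and T bases in the whole sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "AT" for base in seq) / len(seq)


def max_window_at(seq: str, window: int) -> float:
    """Highest A+T fraction found in any window of the given length.

    Sequences shorter than the window are scored as a single region.
    """
    seq = seq.upper()
    if len(seq) <= window:
        return at_fraction(seq)
    count = sum(base in "AT" for base in seq[:window])
    best = count
    for i in range(window, len(seq)):
        # Slide by one base: add the entering base, drop the leaving one.
        count += (seq[i] in "AT") - (seq[i - window] in "AT")
        best = max(best, count)
    return best / window


def check_cdna(seq: str) -> dict:
    """Evaluate a cDNA against the A+T limits given in the text."""
    overall = at_fraction(seq)
    return {
        "overall_below_60": overall < 0.60,
        "overall_below_50": overall < 0.50,          # preferred
        "window20_within_75": max_window_at(seq, 20) <= 0.75,
        "window60_within_60": max_window_at(seq, 60) <= 0.60,
        "window60_within_55": max_window_at(seq, 60) <= 0.55,  # more preferred
    }


# Hypothetical sequence: GC-rich overall, with one 20 bp AT-only stretch.
example = "GC" * 20 + "AT" * 10 + "GC" * 20
print(check_cdna(example))
```

The example passes the overall limits yet fails the 20 bp window test, which is exactly the kind of localized A+T-rich region the text says to avoid.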

Of particular interest are the Mortierella alpina Δ5-desaturase, Δ6- 
desaturase, Δ12-desaturase and Δ15 desaturase. The gene encoding the Mortierella 
alpina Δ5-desaturase can be expressed in transgenic plants to effect greater 
synthesis of ARA from DGLA. Other DNAs which are substantially identical in 
sequence to the Mortierella alpina Δ5-desaturase DNA, or which encode 
polypeptides which are substantially identical in sequence to the Mortierella alpina 
Δ5-desaturase polypeptide, also can be used. The gene encoding the Mortierella 
alpina Δ6-desaturase can be expressed in transgenic plants or animals to effect 
greater synthesis of GLA from linoleic acid or of stearidonic acid (SDA) from 
ALA. Other DNAs which are substantially identical in sequence to the Mortierella 
alpina Δ6-desaturase DNA, or which encode polypeptides which are substantially 
identical in sequence to the Mortierella alpina Δ6-desaturase polypeptide, also can 
be used. 

The gene encoding the Mortierella alpina Δ12-desaturase can be expressed 
in transgenic plants to effect greater synthesis of LA from oleic acid. Other DNAs 
which are substantially identical to the Mortierella alpina Δ12-desaturase DNA, or 
which encode polypeptides which are substantially identical to the Mortierella 
alpina Δ12-desaturase polypeptide, also can be used. 

By substantially identical in sequence is intended an amino acid sequence 
or nucleic acid sequence exhibiting in order of increasing preference at least 60%, 
80%, 90% or 95% homology to the Mortierella alpina Δ5-desaturase amino acid 
sequence or nucleic acid sequence encoding the amino acid sequence. For 
polypeptides, the length of comparison sequences generally is at least 16 amino 
acids, preferably at least 20 amino acids, or most preferably 35 amino acids. For 
nucleic acids, the length of comparison sequences generally is at least 50 
nucleotides, preferably at least 60 nucleotides, and more preferably at least 75 
nucleotides, and most preferably, 110 nucleotides. Homology typically is measured 
using sequence analysis software, for example, the Sequence Analysis software 
package of the Genetics Computer Group, University of Wisconsin Biotechnology 
Center, 1710 University Avenue, Madison, Wisconsin 53705, MEGAlign 
(DNAStar, Inc., 1228 S. Park St., Madison, Wisconsin 53715), and MacVector 
(Oxford Molecular Group, 2105 S. Bascom Avenue, Suite 200, Campbell, 
California 95008). Such software matches similar sequences by assigning degrees 
of homology to various substitutions, deletions, and other modifications. 
Conservative substitutions typically include substitutions within the following 
groups: glycine and alanine; valine, isoleucine and leucine; aspartic acid, glutamic 
acid, asparagine, and glutamine; serine and threonine; lysine and arginine; and 
phenylalanine and tyrosine. Substitutions may also be made on the basis of 
conserved hydrophobicity or hydrophilicity (Kyte and Doolittle, J. Mol. Biol. 157: 
105-132, 1982), or on the basis of the ability to assume similar polypeptide 
secondary structure (Chou and Fasman, Adv. Enzymol. 47: 45-148, 1978). 
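
As a rough illustration of how percent homology can be scored once two sequences have been aligned, the sketch below counts identical positions and positions that fall within the conservative substitution groups listed above. It is a simplification: it assumes the two sequences are already aligned to equal length (with `-` for gaps) and does not reproduce the scoring of the commercial packages named in the text. The function names are hypothetical.

```python
# Conservative substitution groups as listed in the text (one-letter codes).
CONSERVATIVE_GROUPS = [
    set("GA"),    # glycine, alanine
    set("VIL"),   # valine, isoleucine, leucine
    set("DENQ"),  # aspartic acid, glutamic acid, asparagine, glutamine
    set("ST"),    # serine, threonine
    set("KR"),    # lysine, arginine
    set("FY"),    # phenylalanine, tyrosine
]


def is_conservative(a: str, b: str) -> bool:
    """True if residues a and b belong to the same conservative group."""
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def identity_and_similarity(aligned_a: str, aligned_b: str):
    """Return (% identical, % identical-or-conservative) over aligned columns.

    Columns containing a gap ('-') count toward the length but never match.
    """
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be non-empty and aligned to equal length")
    identical = similar = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if "-" in (x, y):
            continue
        if x == y:
            identical += 1
            similar += 1
        elif is_conservative(x, y):
            similar += 1
    n = len(aligned_a)
    return 100.0 * identical / n, 100.0 * similar / n


print(identity_and_similarity("GVDSKF", "AIETRY"))
```

Under the text's criteria, the comparison would be made over at least 16 amino acids (or 50 nucleotides), so the six-residue strings here are for demonstration only.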

EXPRESSION OF DESATURASE GENES 

Once the DNA encoding a desaturase polypeptide has been obtained, it is 
placed in a vector capable of replication in a host cell, or is propagated in vitro by 
means of techniques such as PCR or long PCR. Replicating vectors can include 
plasmids, phage, viruses, cosmids and the like. Desirable vectors include those 
useful for mutagenesis of the gene of interest or for expression of the gene of 
interest in host cells. The technique of long PCR has made in vitro propagation of 
large constructs possible, so that modifications to the gene of interest, such as 
mutagenesis or addition of expression signals, and propagation of the resulting 
constructs can occur entirely in vitro without the use of a replicating vector or a 
host cell. 

For expression of a desaturase polypeptide, functional transcriptional and
translational initiation and termination regions are operably linked to the DNA
encoding the desaturase polypeptide. Transcriptional and translational initiation
and termination regions are derived from a variety of nonexclusive sources,
including the DNA to be expressed, genes known or suspected to be capable of
expression in the desired system, expression vectors, chemical synthesis, or from an
endogenous locus in a host cell. Expression in a plant tissue and/or plant part
presents certain efficiencies, particularly where the tissue or part is one which is
easily harvested, such as seed, leaves, fruits, flowers, roots, etc. Expression can be
targeted to that location within the plant by using specific regulatory sequences,
such as those of USPN 5,463,174, USPN 4,943,674, USPN 5,106,739, USPN
5,175,095, USPN 5,420,034, USPN 5,188,958, and USPN 5,589,379.
Alternatively, the expressed protein can be an enzyme which produces a product
which may be incorporated, either directly or upon further modifications, into a
fluid fraction from the host plant. In the present case, expression of desaturase
genes, or antisense desaturase transcripts, can alter the levels of specific PUFAs, or
derivatives thereof, found in plant parts and/or plant tissues. The Δ5-desaturase
polypeptide coding region is expressed either by itself or with other genes, in order
to produce tissues and/or plant parts containing higher proportions of desired
PUFAs or in which the PUFA composition more closely resembles that of human
breast milk (Prieto et al., PCT publication WO 95/24494). The termination region
can be derived from the 3' region of the gene from which the initiation region was
obtained or from a different gene. A large number of termination regions are
known and have been found to be satisfactory in a variety of hosts from the same
and different genera and species. The termination region usually is selected more
as a matter of convenience rather than because of any particular property.

The choice of a host cell is influenced in part by the desired PUFA profile
of the transgenic cell, and the native profile of the host cell. As an example, for
production of linoleic acid from oleic acid, the DNA sequence used encodes a
polypeptide having Δ12 desaturase activity, and for production of GLA from
linoleic acid, the DNA sequence used encodes a polypeptide having Δ6 desaturase
activity. A host cell which expresses Δ12 desaturase activity and lacks or is
depleted in Δ15 desaturase activity can be used with an expression cassette which
provides for overexpression of Δ6 desaturase alone; this generally is sufficient to
provide for enhanced GLA production in the transgenic cell. Where the host cell
expresses Δ9 desaturase activity, expression of both a Δ12- and a Δ6-desaturase can
provide for enhanced GLA production. In particular instances where expression of
Δ6 desaturase activity is coupled with expression of Δ12 desaturase activity, it is
desirable that the host cell naturally have, or be mutated to have, low Δ15
desaturase activity. Alternatively, a host cell for Δ6 desaturase expression may
have, or be mutated to have, high Δ12 desaturase activity.
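
The host-selection reasoning above can be restated as a small lookup over the
18-carbon desaturation steps. The following Python sketch is illustrative only:
the function names, the activity labels ("delta9" and so on) and the data layout
are assumptions made for this example and are not part of the disclosure. It
models only the route from oleic acid through linoleic acid to GLA, plus the
competing Δ15 activity.

```python
# Hypothetical helper: list the desaturase activities that must be
# supplied to reach a target fatty acid in a given host.

# Ordered desaturation steps:
# (product, activity that forms it from the previous fatty acid)
STEPS = [
    ("oleic acid", "delta9"),      # formed from stearic acid
    ("linoleic acid", "delta12"),  # formed from oleic acid
    ("GLA", "delta6"),             # formed from linoleic acid
]

# delta15 desaturase draws linoleic acid away from the GLA route,
# which is why a host low in delta15 activity is preferred.
COMPETING_ACTIVITY = "delta15"


def desaturases_to_introduce(host_activities, target):
    """Return the activities on the route to `target` that the host lacks."""
    missing = []
    for product, activity in STEPS:
        if activity not in host_activities:
            missing.append(activity)
        if product == target:
            return missing
    raise ValueError(f"target not on the modeled route: {target!r}")


def host_diverts_linoleic_acid(host_activities):
    """True if the host expresses the competing delta15 activity."""
    return COMPETING_ACTIVITY in host_activities


if __name__ == "__main__":
    host = {"delta9"}
    print(desaturases_to_introduce(host, "GLA"))
    print(host_diverts_linoleic_acid(host))
```

For a host that expresses only Δ9 activity, the sketch reports that both Δ12
and Δ6 activities would have to be supplied for GLA, matching the case
described in the text.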

Expression in a host cell can be accomplished in a transient or stable
fashion. Transient expression can occur from introduced constructs which contain
expression signals functional in the host cell, but which constructs do not replicate
and rarely integrate in the host cell, or where the host cell is not proliferating.
Transient expression also can be accomplished by inducing the activity of a
regulatable promoter operably linked to the gene of interest, although such
inducible systems frequently exhibit a low basal level of expression. Stable
expression can be achieved by introduction of a construct that can integrate into the
host genome or that autonomously replicates in the host cell. Stable expression of
the gene of interest can be selected for through the use of a selectable marker
located on or transfected with the expression construct, followed by selection for
cells expressing the marker. When stable expression results from integration,
integration of constructs can occur randomly within the host genome or can be
targeted through the use of constructs containing regions of homology with the host
genome sufficient to target recombination with the host locus. Where constructs
are targeted to an endogenous locus, all or some of the transcriptional and
translational regulatory regions can be provided by the endogenous locus.

WO 99/64616
PCT/US99/13332

When increased expression of the desaturase polypeptide in the source plant
is desired, several methods can be employed. Additional genes encoding the
desaturase polypeptide can be introduced into the host organism. Expression from
the native desaturase locus also can be increased through homologous
recombination, for example by inserting a stronger promoter into the host genome
to cause increased expression, by removing destabilizing sequences from either the
mRNA or the encoded protein by deleting that information from the host genome,
or by adding stabilizing sequences to the mRNA (see USPN 4,910,141 and USPN
5,500,365).

When it is desirable to express more than one different gene, with appropriate
regulatory regions and expression methods, introduced genes can be propagated in
the host cell through use of replicating vectors or by integration into the host
genome. Where two or more genes are expressed from separate replicating vectors,
it is desirable that each vector has a different means of replication. Each introduced
construct, whether integrated or not, should have a different means of selection and
should lack homology to the other constructs to maintain stable expression and
prevent reassortment of elements among constructs. Judicious choices of
regulatory regions, selection means and method of propagation of the introduced
construct can be experimentally determined so that all introduced genes are
expressed at the necessary levels to provide for synthesis of the desired products.

Constructs comprising the gene of interest may be introduced into a host
cell by standard techniques. These techniques include transfection, infection,
ballistic impact, electroporation, microinjection, scraping, or any other method
which introduces the gene of interest into the host cell (see USPN 4,743,548, USPN
4,795,855, USPN 5,068,193, USPN 5,188,958, USPN 5,463,174, USPN 5,565,346
and USPN 5,565,347). For convenience, a host cell which has been manipulated
by any method to take up a DNA sequence or construct will be referred to as
"transformed" or "recombinant" herein. The subject host will have at least
one copy of the expression construct and may have two or more, depending upon
whether the gene is integrated into the genome, amplified, or is present on an
extrachromosomal element having multiple copy numbers.

The transformed host cell can be identified by selection for a marker
contained on the introduced construct. Alternatively, a separate marker construct
may be introduced with the desired construct, as many transformation techniques
introduce many DNA molecules into host cells. Typically, transformed hosts are
selected for their ability to grow on selective media. Selective media may
incorporate an antibiotic or lack a factor necessary for growth of the untransformed
host, such as a nutrient or growth factor. An introduced marker gene therefore may
confer antibiotic resistance, or encode an essential growth factor or enzyme, and
permit growth on selective media when expressed in the transformed host cell.
Desirably, resistance to kanamycin and the amino glycoside G418 are of interest
(see USPN 5,034,322). Selection of a transformed host can also occur when the
expressed marker protein can be detected, either directly or indirectly. The marker
protein may be expressed alone or as a fusion to another protein. The marker
protein can be detected by its enzymatic activity; for example β-galactosidase can
convert the substrate X-gal to a colored product, and luciferase can convert
luciferin to a light-emitting product. The marker protein can be detected by its
light-producing or modifying characteristics; for example, the green fluorescent
protein of Aequorea victoria fluoresces when illuminated with blue light.
Antibodies can be used to detect the marker protein or a molecular tag on, for
example, a protein of interest. Cells expressing the marker protein or tag can be
selected, for example, visually, or by techniques such as FACS or panning using
antibodies.

The PUFAs produced using the subject methods and compositions may be
found in the host plant tissue and/or plant part as free fatty acids or in conjugated
forms such as acylglycerols, phospholipids, sulfolipids or glycolipids, and may be
extracted from the host cell through a variety of means well-known in the art. Such
means may include extraction with organic solvents, sonication, supercritical fluid
extraction using for example carbon dioxide, and physical means such as presses,
or combinations thereof. Of particular interest is extraction with hexane or
methanol and chloroform. Where desirable, the aqueous layer can be acidified to
protonate negatively charged moieties and thereby increase partitioning of desired
products into the organic layer. After extraction, the organic solvents can be
removed by evaporation under a stream of nitrogen. When isolated in conjugated
forms, the products are enzymatically or chemically cleaved to release the free fatty
acid or a less complex conjugate of interest, and are then subjected to further
manipulations to produce a desired end product. Desirably, conjugated forms of
fatty acids are cleaved with potassium hydroxide.

PURIFICATION OF FATTY ACIDS
If further purification is necessary, standard methods can be employed.
Such methods include extraction, treatment with urea, fractional crystallization,
HPLC, fractional distillation, silica gel chromatography, high speed centrifugation
or distillation, or combinations of these techniques. Protection of reactive groups,
such as the acid or alkenyl groups, may be done at any step through known
techniques, for example alkylation or iodination. Methods used include
methylation of the fatty acids to produce methyl esters. Similarly, protecting
groups may be removed at any step. Desirably, purification of fractions containing
ARA, DHA and EPA is accomplished by treatment with urea and/or fractional
distillation.

USES OF FATTY ACIDS
The uses of the fatty acids of subject invention are several. Probes based on
the DNAs of the present invention may find use in methods for isolating related
molecules or in methods to detect organisms expressing desaturases. When used as
probes, the DNAs or oligonucleotides need to be detectable. This is usually
accomplished by attaching a label either at an internal site, for example via
incorporation of a modified residue, or at the 5' or 3' terminus. Such labels can be
directly detectable, can bind to a secondary molecule that is detectably labeled, or
can bind to an unlabelled secondary molecule and a detectably labeled tertiary
molecule; this process can be extended as long as is practical to achieve a
satisfactorily detectable signal without unacceptable levels of background signal.
Secondary, tertiary, or bridging systems can include use of antibodies directed
against any other molecule, including labels or other antibodies, or can involve any
molecules which bind to each other, for example a biotin-streptavidin/avidin
system. Detectable labels typically include radioactive isotopes, molecules which
chemically or enzymatically produce or alter light, enzymes which produce
detectable reaction products, magnetic molecules, fluorescent molecules or
molecules whose fluorescence or light-emitting characteristics change upon
binding. Examples of labelling methods can be found in USPN 5,011,770.
Alternatively, the binding of target molecules can be directly detected by measuring
the change in heat of solution on binding of probe to target via isothermal titration
calorimetry, or by coating the probe or target on a surface and detecting the change
in scattering of light from the surface produced by binding of target or probe,
respectively, as may be done with the BIAcore system.

The invention will be better understood by reference to the following
non-limiting examples.

Examples

Example 1

Expression of ω-3 desaturase from C. elegans in transgenic plants

The D15/ω-3 activity of Brassica napus can be increased by the expression
of an ω-3 desaturase from C. elegans. The fat-1 cDNA clone (Genbank accession
L41807; Spychalla, J. H., Kinney, A. J., and Browse, J. 1997 P.N.A.S. 94,
1142-1147) was obtained from John Browse at Washington State University. The fat-1
cDNA was modified by PCR to introduce cloning sites using the following primers:

Fat-1 forward:

y^ACUAOTACUACTGCAOACAATCCn'CCKrrCATTCCrCAGA-y 

Fat-1 reverse:

5'- CAUCAUCAUCAUGCGCCCGCrrACTrGGCCnTGCCIT - 3' 

These primers allowed the amplification of the entire coding region and
added PstI and NotI sites to the 5'- and 3'-ends, respectively. The PCR product was
subcloned into pAMP1 (GIBCOBRL) using the CloneAmp system (GIBCOBRL)
to create pCGN5562. The sequence was verified by sequencing of both strands to
be sure no changes were introduced by PCR. For seed specific expression, the Fat-1
coding region was cut out of pCGN5562 as a PstI/NotI fragment and inserted
between the PstI/NotI sites of the binary vector, pCGN8623, to create pCGN5563.
pCGN5563 can be introduced into Brassica napus via Agrobacterium-mediated
transformation.

Construction of pCGN8623

The polylinker region of the napin promoter cassette, pCGN7770, was
replaced by ligating the following oligonucleotides:

5'-TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC-3' and
5'-TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG-3'.

These oligonucleotides were ligated into SalI/XhoI-digested pCGN7770 to produce
pCGN8619. These oligos encode BamHI, NotI, HindIII, and PstI restriction sites.
pCGN8619 contains the oligos oriented such that the PstI site is closest to the napin
5' regulatory region. A fragment containing the napin 5' regulatory region,
polylinker, and napin 3' region was removed from pCGN8619 by digestion with
Asp718I. The fragment was blunt-ended by filling in the 5' overhangs with Klenow
fragment then ligated into pCGN5139 that had been digested with Asp718I and
HindIII and blunt-ended by filling in the 5' overhangs with Klenow fragment. A
plasmid containing the insert oriented so that the napin promoter was closest to the
blunted Asp718I site of pCGN5139 and the napin 3' was closest to the blunted
HindIII site was subjected to sequence analysis to confirm both the insert
orientation and the integrity of cloning junctions. The resulting plasmid was
designated pCGN8623.
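
The polylinker design described above can be checked mechanically. The Python
sketch below is an illustration, not part of the disclosure: it assumes the
second oligonucleotide reads GCGGCCGC at the position where the scanned text
shows an ambiguous character, and the helper names (`revcomp`,
`site_positions`, `anneals_with_tcga_overhangs`) are hypothetical. It tests
whether the two oligonucleotides anneal into a duplex with 5'-TCGA overhangs
on both ends (the overhang left by both SalI and XhoI) and locates the four
named restriction sites on the first strand.

```python
# Polylinker oligonucleotides, written 5' to 3'.
# Assumption: the ambiguous base in the second oligo is read as G.
TOP = "TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC"
BOTTOM = "TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG"

# Recognition sequences of the enzymes named in the text.
SITES = {
    "PstI": "CTGCAG",
    "HindIII": "AAGCTT",
    "NotI": "GCGGCCGC",
    "BamHI": "GGATCC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def site_positions(seq):
    """0-based index of the first occurrence of each site (-1 if absent)."""
    return {name: seq.find(site) for name, site in SITES.items()}


def anneals_with_tcga_overhangs(top, bottom):
    """True if the strands pair everywhere except a 5'-TCGA overhang on each."""
    return (
        top[:4] == "TCGA"
        and bottom[:4] == "TCGA"
        and revcomp(top[4:]) == bottom[4:]
    )


if __name__ == "__main__":
    print(anneals_with_tcga_overhangs(TOP, BOTTOM))
    for name, pos in sorted(site_positions(TOP).items(), key=lambda kv: kv[1]):
        print(name, pos)
```

Under these assumptions the site order along the first strand is PstI,
HindIII, NotI, BamHI, which is the reverse of the order in which the text
lists them.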

To produce high levels of stearidonic acid in Brassica, the C. elegans ω-3
desaturase can be combined with D6- and D12-desaturases from Mortierella
alpina. pCGN5563-transformed plants may be crossed with pCGN5544-
transformed plants expressing the D6- and D12-desaturases.

The resulting F1 seeds can be analyzed for stearidonic acid content and
selected F1 plants can be used for self-pollination to produce F2 seed, or as donors
for production of dihaploids, or additional crosses.

An alternative method to combine the fat-1 cDNA with M. alpina D6 and
D12 desaturases is to combine them on one T-DNA for transformation. The fat-1
coding region from pCGN5562 can be cut out as a PstI/NotI fragment and inserted
into PstI/NotI digested pCGN8619. The transcriptional unit consisting of the napin
5' regulatory region, the fat-1 coding region, and the napin 3'-regulatory region can
be cut out as a Sse8387I fragment and inserted into pCGN5544 cut with Sse8387I.
The resulting plasmid would contain three napin transcriptional units containing the
C. elegans ω-3 desaturase, M. alpina D6 desaturase, and M. alpina D12 desaturase,
all oriented in the same direction as the 35S/nptII/tml transcriptional unit used for
selection of transformed tissue.

Example 2

The D15-desaturase activity of Brassica napus can be increased by
over-expression of the D15-desaturase cDNA clone.

A B. napus D15-desaturase cDNA clone was obtained by PCR
amplification of first-strand cDNA derived from B. napus cv. 212/86. The primers
were based on published sequence: Genbank# L01418, Arondel et al. 1992
Science 258:1353-1355.

The following primers were used:

Bnd15-FORWARD

5'-CUACUACUACUAGAGCTCAGCGATGGTTGTTGCTATGGAC-3'

Bnd15-REVERSE

5'-CAUCAUCAUCAUGAATTCTTAATTGATTTTAGATTTG-3'

These primers allowed the amplification of the entire coding region and
added SacI and EcoRI sites to the 5'- and 3'-ends, respectively.
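
The layout of these primers (a 12-nucleotide uracil-containing tail, then a
restriction site, then gene-specific sequence) can be confirmed with a short
script. The Python sketch below is illustrative only. The function name
`split_primer` is hypothetical, and the forward primer is read with G at the
one position that is unclear in the scanned text, which is an assumption.

```python
# 12-nt tails that begin every primer in these examples.
FORWARD_TAIL = "CUACUACUACUA"
REVERSE_TAIL = "CAUCAUCAUCAU"

# Primer sequences, 5' to 3'.
# Assumption: the unclear base near the 3' end of the forward primer is G.
PRIMERS = {
    "Bnd15-FORWARD": "CUACUACUACUAGAGCTCAGCGATGGTTGTTGCTATGGAC",
    "Bnd15-REVERSE": "CAUCAUCAUCAUGAATTCTTAATTGATTTTAGATTTG",
}

# Recognition sequences of the enzymes named in the text.
SITES = {"SacI": "GAGCTC", "EcoRI": "GAATTC"}


def split_primer(primer, tail, site_length=6):
    """Split a primer into (restriction site, gene-specific remainder).

    Raises ValueError if the primer does not start with the expected tail.
    """
    if not primer.startswith(tail):
        raise ValueError("primer does not start with the expected tail")
    body = primer[len(tail):]
    return body[:site_length], body[site_length:]


if __name__ == "__main__":
    fwd_site, fwd_rest = split_primer(PRIMERS["Bnd15-FORWARD"], FORWARD_TAIL)
    rev_site, rev_rest = split_primer(PRIMERS["Bnd15-REVERSE"], REVERSE_TAIL)
    print(fwd_site == SITES["SacI"], fwd_rest)
    print(rev_site == SITES["EcoRI"], rev_rest)
```

In the forward primer the remainder contains the ATG start codon four
nucleotides after the SacI site; in the reverse primer the remainder begins
with TTA, the reverse complement of a TAA stop codon.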

The PCR product was subcloned into pAMP1 (GIBCOBRL) using the
CloneAmp system (GIBCOBRL) to create pCGN5520. The sequence was verified
by sequencing of both strands to be sure that the open reading frame remained
intact.

For seed specific expression, the D15-desaturase coding region was cut out of
pCGN5520 as a BamHI/SalI fragment and inserted between the BglII and XhoI
sites of pCGN7770, to create pCGN5557. The PstI fragment of pCGN5557
containing the napin 5'-regulatory region, B. napus D15-desaturase, and napin 3'-
regulatory region was inserted into the PstI site of the binary vector, pCGN5138, to
produce pCGN5558. pCGN5558 was introduced into Brassica napus via
Agrobacterium-mediated transformation.

To produce high levels of stearidonic acid in Brassica, the D15-desaturase
can be combined with D6- and D12-desaturases from Mortierella alpina.
pCGN5558-transformed plants may be crossed with pCGN5544-transformed plants
expressing the D6- and D12-desaturases. The resulting F1 seeds can be analyzed for
stearidonic acid content and selected F1 plants can be used for self-pollination to
produce F2 seed, or as donors for production of dihaploids, or additional crosses.

An alternative method to combine the B. napus D15-desaturase with M.
alpina D6 and D12 desaturases is to combine them on one T-DNA for
transformation. The transcription cassette consisting of the napin 5'-regulatory
region, the D15-desaturase coding region, and the napin 3'-regulatory region can be
cut out of pCGN5557 as a SwaI fragment and inserted into SwaI-digested
pCGN5544. The resulting plasmid would contain three napin transcriptional units
containing the M. alpina D6 desaturase, the B. napus D15-desaturase, and the M.
alpina D12 desaturase, all oriented in the same direction as the 35S/nptII/tml
transcriptional unit used for selection of transformed tissue.

Example 3

Expression in Leaves

Ma29 is a putative M. alpina D5 desaturase as determined by sequence
homology. This experiment was designed to determine whether leaves expressing
Ma29 (as determined by Northern) were able to convert exogenously applied
DGLA (20:3) to ARA (20:4).

The Ma29 desaturase cDNA was modified by PCR to introduce convenient
restriction sites for cloning. The desaturase coding region has been inserted into a
d35 cassette under the control of the double 35S promoter for expression in
Brassica leaves (pCGN5525) following standard protocols (see USPN 5,424,200
and USPN 5,106,739). Transgenic Brassica plants containing pCGN5525 were
generated following standard protocols (see USPN 5,188,958 and USPN
5,463,174).

In the first experiment, three plants were used: a control, LP004-1, and
two transgenics, 5525-23 and 5525-29. LP004 is a low-linolenic Brassica variety.
Leaves of each were selected for one of three treatments: water, GLA or DGLA.
GLA and DGLA were purchased as sodium salts from NuChek Prep and dissolved
in water at 1 mg/ml. Aliquots were capped under N2 and stored at -70 degrees C.
Leaves were treated by applying a 50 µl drop to the upper surface and gently
spreading with a gloved finger to cover the entire surface. Applications were made
approximately 30 minutes before the end of the light cycle to minimize any
photo-oxidation of the applied fatty acids. After 6 days of treatment one leaf from each
treatment was harvested and cut in half through the mid rib. One half was washed
with water to attempt to remove unincorporated fatty acid. Leaf samples were
lyophilized overnight, and fatty acid composition determined by gas
chromatography (GC). The results are shown in Table 1.

Leaves treated with GLA contained from 1.56 to 2.4 wt% GLA. The fatty acid
analysis showed that the lipid composition of control and transgenic leaves was
essentially the same. Leaves of control plants treated with DGLA contained
1.2-1.9 wt% DGLA and background amounts of ARA (.26-.27 wt%). Transgenic leaves
contained only .2-.7 wt% DGLA, but levels of ARA were increased (.74-1.1 wt%),
indicating that the DGLA was converted to ARA in these leaves.

Expression in Seed

The purpose of this experiment was to determine whether a construct with
the seed specific napin promoter would enable expression in seed.

The Ma29 cDNA was modified by PCR to introduce XhoI cloning sites
upstream and downstream of the start and stop codons, respectively, using the
following primers:

Madxho-forward:

5.CUACUACUACUACTCGAGCAAGATOOGAACOCACCAAGG 

Madxho-reverse: 

5-CAUCAUCAUCAUCTCOAOCTACTCTTCCTTGGGACGOAG 

The PCR product was subcloned into pAMP1 (GIBCOBRL) using the
CloneAmp system (GIBCOBRL) to create pCGN5522 and the Δ5 desaturase
sequence was verified by sequencing of both strands.

For seed-specific expression, the Ma29 coding region was cut out of
pCGN5522 as an XhoI fragment and inserted into the SalI site of the napin
expression cassette, pCGN3223, to create pCGN5528. The HindIII fragment of
pCGN5528 containing the napin 5' regulatory region, the Ma29 coding region, and
the napin 3' regulatory region was inserted into the HindIII site of pCGN1557 to
create pCGN5531. Two copies of the napin transcriptional unit were inserted in
tandem. This tandem construct can permit higher expression of the desaturases per
genetic locus. pCGN5531 was introduced into Brassica napus cv. LP004 via
Agrobacterium mediated transformation.

The fatty acid composition of twenty-seed pools of mature T2 seeds was
analyzed by GC. Table 2 shows the results obtained with independent transformed
lines as compared to non-transformed LP004 seed. The transgenic seeds containing
pCGN5531 contain two fatty acids that are not present in the control seeds,
tentatively identified as taxoleic acid (5,9-18:2) and pinolenic acid (5,9,12-18:3),
based on their elution relative to oleic and linoleic acid. These would be the
expected products of Δ5 desaturation of oleic and linoleic acids. No other
differences in fatty acid composition were observed in the transgenic seeds.
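
The shorthand used for these fatty acids (double-bond positions, then carbon
number and double-bond count, as in 5,9,12-18:3) lends itself to a small
parser. The Python sketch below is illustrative only, and its function names
are hypothetical. It shows that adding a double bond at position 5 to oleic
acid (9-18:1) and to linoleic acid (9,12-18:2) yields the two labels given
above.

```python
def parse_fatty_acid(label):
    """Parse 'positions-carbons:bonds' into (carbons, bonds, positions).

    A label without positions, such as '18:0', gives an empty position list.
    """
    positions_text, _, chain_text = label.rpartition("-")
    carbons_text, bonds_text = chain_text.split(":")
    carbons, bonds = int(carbons_text), int(bonds_text)
    positions = (
        [int(p) for p in positions_text.split(",")] if positions_text else []
    )
    if positions and len(positions) != bonds:
        raise ValueError("number of positions does not match double-bond count")
    return carbons, bonds, positions


def format_fatty_acid(carbons, positions):
    """Build the shorthand label from a carbon count and bond positions."""
    joined = ",".join(str(p) for p in positions)
    return f"{joined}-{carbons}:{len(positions)}"


def desaturate(label, position):
    """Return the label obtained by adding one double bond at `position`."""
    carbons, bonds, positions = parse_fatty_acid(label)
    if len(positions) != bonds:
        raise ValueError("double-bond positions must be given explicitly")
    if position in positions:
        raise ValueError("a double bond is already present at that position")
    return format_fatty_acid(carbons, sorted(positions + [position]))


if __name__ == "__main__":
    print(desaturate("9-18:1", 5))     # oleic acid plus a bond at position 5
    print(desaturate("9,12-18:2", 5))  # linoleic acid plus a bond at position 5
```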



Example 4

Production of D5-desaturated Fatty Acids in Transgenic Plants

The construction of pCGN5531 (D5-desaturase) and fatty acid composition
of T2 seed pools is described in Example 3. This example takes the seeds through
one more generation and discusses ways to maximize the D5-desaturated fatty
acids.

Example 3 describes the fatty acid composition of T2 seed pools of
pCGN5531-transformed B. napus cv. LP004 plants. To investigate the segregation
of D5-desaturated fatty acids in the T2 seeds and to identify individual plants to be
taken on to subsequent generations, half-seed analysis was done. Seeds were
germinated overnight in the dark at 30 degrees on water-soaked filter paper. The
outer cotyledon was excised for GC analysis and the rest of the seedling was
planted in soil. Results of some of these analyses are shown in the accompanying
Table 3. D5,9-18:2 accumulated to as high as [illegible]% of the total fatty acids and
D5,9,12-18:3 accumulated to up to 0.77% of the fatty acids. These and other
individually selected T2 plants were grown in the greenhouse to produce T3 seed.

To maximize the accumulation of D5,9 18:2 in seed oil, the pCGN5531
construct could be introduced into a high oleic acid variety of canola. A high-oleic
variety could be obtained by mutation, co-suppression, or antisense suppression of
the D12 and D15 desaturases or other necessary co-factors.

To maximize accumulation of D5,9,12 18:3 in canola, the pCGN5531
construct could be introduced into a high linoleic strain of canola. This could be
achieved by crossing pCGN5531-transformed plants with pCGN5542 (M. alpina
D12-desaturase) transformed plants. Alternatively, the D5 and D12 desaturases
could be combined on one T-DNA for transformation. The transcriptional unit
consisting of the napin 5'-regulatory region, the M. alpina D12-desaturase coding
region, and the napin 3'-regulatory region can be cut out of pCGN5541 (described
in CGAB320) as a NotI fragment. NotI/XbaI linkers could be ligated and the
resulting fragment inserted into the XbaI site of pCGN5531. The resulting plasmid
would contain three napin transcriptional units containing the M. alpina D12
desaturase, and two copies of the napin/M. alpina D5 desaturase/napin unit, all
oriented in the same direction as the 35S/nptII/tml transcriptional unit used for
selection of transformed tissue.

Example 5

Expression of M. alpina Δ6 Desaturase in Brassica napus

A nucleic acid sequence from a partial cDNA clone, Ma524, encoding a Δ6
fatty acid desaturase from Mortierella alpina was obtained by random sequencing
of clones from the M. alpina cDNA library. The Ma524 cDNA was modified by
PCR to introduce cloning sites using the following primers:

Ma524PCR-1

5-CUACUACUACUATCTAGACTCGAGACCATGGCTGCTCXrT
CCAQTOTG

Ma524PCR-2

5'-CAUCAUCAUCAUAGGCCTCGAGTTACTGCGCCTTACCCAT

These primers allowed the amplification of the entire coding region and
added XbaI and XhoI sites to the 5'-end and XhoI and StuI sites to the 3' end. The
PCR product was subcloned into pAMP1 (GIBCOBRL) using the CloneAmp
system (GIBCOBRL) to create pCGN5535 and the Δ6 desaturase sequence was
verified by sequencing of both strands.

For seed-specific expression, the Ma524 coding region was cut out of
pCGN5535 as an XhoI fragment and inserted into the SalI site of the napin
expression cassette, pCGN3223, to create pCGN5536. The NotI fragment of
pCGN5536 containing the napin 5' regulatory region, the Ma524 coding region,
and the napin 3' regulatory region was inserted into the NotI site of pCGN1557 to
create pCGN5538. pCGN5538 was introduced into Brassica napus cv. LP004 via
Agrobacterium mediated transformation.

Maturing T2 seeds were collected from 6 independent transformation
events in the greenhouse. The fatty acid composition of single seeds was analyzed
by GC. Table 4 shows the results of control LP004 seeds and six 5538 lines. All of
the 5538 lines except #[illegible] produced seeds containing GLA. Presence of GLA
segregated in these seeds as is expected for the T2 selfed seed population. In
addition to GLA, the M. alpina Δ6 desaturase is capable of producing 18:4
(stearidonic) and another fatty acid believed to be the 6,9-18:2.
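
The segregation mentioned above follows ordinary Mendelian expectations for
selfed seed. The Python sketch below is an illustration under stated
assumptions, not a description of the data in Table 4: it assumes the parent
plant is hemizygous at each transgene insertion and that insertions at
different loci are unlinked. The function names are hypothetical.

```python
from fractions import Fraction


def single_locus_classes():
    """Expected genotype fractions among selfed seed of a hemizygous parent."""
    return {
        "homozygous": Fraction(1, 4),
        "hemizygous": Fraction(1, 2),
        "null": Fraction(1, 4),
    }


def null_fraction(n_loci):
    """Expected fraction of selfed seed carrying no transgene copy at all,
    assuming n_loci unlinked hemizygous insertions."""
    if n_loci < 1:
        raise ValueError("n_loci must be at least 1")
    return Fraction(1, 4) ** n_loci


def transgenic_fraction(n_loci):
    """Expected fraction of selfed seed carrying at least one transgene copy."""
    return 1 - null_fraction(n_loci)


if __name__ == "__main__":
    for n in (1, 2):
        print(n, null_fraction(n), transgenic_fraction(n))
```

With a single insertion about one seed in four is expected to lack the
transgene, which is why single-seed and half-seed analyses are used to pick
individuals for the next generation.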

The above results show that desaturases with three different substrate
specificities can be expressed in a heterologous system and used to produce
poly-unsaturated long chain fatty acids. Exemplified were the production of ARA (20:4)
from the precursor 20:3 (DGLA), the production of GLA (18:3) from 18:2
substrate, and the conversion of 18:1 substrate to 18:2, which is the precursor for
GLA.

Example 6

Production of D6,9 18:2 in Canola Oil

Example 5 described construction of pCGN5538 designed to express the M.
alpina D6 desaturase in seeds of transgenic canola. Table 4 in that example
showed examples of single seed analyses from 6 independent transgenic events.
Significant amounts of GLA were produced, in addition to the D6,9 18:2 fatty acid.

A total of 29 independent pCGN5538-transformed transgenic plants of
low-linolenic LP004 cultivar were regenerated and grown in the greenhouse. Table
5 shows the fatty acid composition of 20-seed pools of T2 seed from each event.
Seven of the lines contained more than 2% of the D6,9 18:2 in the seed pools. To
identify and select plants with high amounts of D6,9 18:2 to be taken on to
subsequent generations, half-seed analysis was done. Seeds were germinated
overnight in the dark at 30 degrees on water-soaked filter paper. The outer
cotyledon was excised for GC analysis and the rest of the seedling was planted in
soil. Based on results of fatty acid analysis, selected T2 plants were grown in the
greenhouse to produce T3 seed. The selection cycle was repeated: pools of T3 seed
were analyzed for D6,9 18:2, T3 half-seeds were dissected and analyzed, and
selected T3 plants were grown in the greenhouse to produce T4 seed. Pools of T4
seed were analyzed for fatty acid composition. Table 5 summarizes the results of
this process for lines derived from one of the original transgenic events,
5538-LP004-25. Levels of D6,9 18:2 have thus been maintained through 3 generations.

To maximize the amount of D6,9 18:2 that could be produced, the
pCGN5538 construct could be introduced into a high oleic acid variety of canola
either by transformation or crossing. A high-oleic variety could be obtained by
mutation, co-suppression, or antisense suppression of the D12 and D15 desaturases
or other necessary co-factors.

Example 7

Hf,>.«.,Hn« nf H n tr n tln llY — ^^^'"^^ A^i»r^ from ot Vr orawMmw

To look for desaturases involved in PUFA production, cDNA libraries were
constructed from total RNA isolated from Schizochytrium (unknown species -
proprietary strain supplied by Kelco in San Diego). Plasmid-based cDNA libraries
were constructed in pSPORT1 (GIBCO-BRL) following manufacturer's
instructions using a commercially available kit (GIBCO-BRL). Random cDNA
clones were sequenced and nucleic acid sequences that encode putative desaturases
were identified through BLAST search of the databases and comparison to known
D12 and D15 sequences.

One clone was identified from the Schizochytrium library with homology to
both D12 and D15 desaturases; it is called 81-53.A2. The DNA sequence is
presented as SEQ ID NO:1. The corresponding peptide sequence is presented as
SEQ ID NO:2.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: KNUTZON, DEBORAH et al.

(ii) TITLE OF INVENTION: POLY-UNSATURATED FATTY ACIDS IN PLANTS

(iii) NUMBER OF SEQUENCES:

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: LIMBACH & LIMBACH L.L.P.
(B) STREET: 2001 FERRY BUILDING
(C) CITY: SAN FRANCISCO
(D) STATE: CA
(E) COUNTRY: USA
(F) ZIP: 94111

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: Microsoft Word

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/833,610
(B) FILING DATE: 11-APR-1997

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: MICHAEL R. WARD
(B) REGISTRATION NUMBER: 38,651
(C) REFERENCE/DOCKET NUMBER: CGMO-100

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (415) 435-4150
(B) TELEFAX: (415) 433-8716
(C) TELEX: N/A

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

[Sequence data for SEQ ID NO: 1 are not legible in the source scan and are omitted here.] 

[A second "INFORMATION FOR SEQ ID NO" entry follows in the source; its identifier and sequence data are not legible.] 
CLAIMS: 

A method of producing a polyunsaturated fatty acid 
in a host cell comprising the steps of: 

(A) transforming a host cell with a nucleotide 
sequence comprising: 1) an expression cassette 
comprising a transcriptional and translational 
initiation regulatory region, said expression 
cassette being joined in reading frame 5' to 
2) a DNA sequence encoding a desaturase 
polypeptide which modulates the production of 
polyunsaturated fatty acids; and 

(B) culturing said transformed host cell under 
time and conditions sufficient for the 
expression of said desaturase polypeptide in 
said host cell, expression of said desaturase 
polypeptide resulting in production of 
polyunsaturated fatty acids by said host cell. 
PLANT FATTY ACID SYNTHASES AND USE IN IMPROVED METHODS FOR 
PRODUCTION OF MEDIUM-CHAIN FATTY ACIDS 

INTRODUCTION 

Field of Invention 

The present invention is directed to genes encoding 
plant fatty acid synthase enzymes relevant to fatty acid 
synthesis in plants, and to methods of using such genes in 
combination with genes encoding plant medium-chain 
preferring thioesterase proteins. Such uses provide a 
method to increase the levels of medium-chain fatty acids 
that may be produced in seed oils of transgenic plants. 

Background 

Higher plants synthesize fatty acids via a common 
metabolic pathway. In developing seeds, where fatty acids 
attached to triglycerides are stored as a source of energy 
for further germination, the fatty acid synthesis pathway is 
located in the plastids. The first step is the formation of 
acetyl-ACP (acyl carrier protein) from acetyl-CoA and ACP 
catalyzed by a short chain preferring condensing enzyme, 
β-ketoacyl-ACP synthase (KAS) III. Elongation of acetyl-ACP 
to 16- and 18-carbon fatty acids involves the cyclical 
action of the following sequence of reactions: condensation 
with a two-carbon unit from malonyl-ACP to form a longer 
β-ketoacyl-ACP (β-ketoacyl-ACP synthase), reduction of the 
keto-function to an alcohol (β-ketoacyl-ACP reductase), 
dehydration to form an enoyl-ACP (β-hydroxyacyl-ACP 
dehydrase), and finally reduction of the enoyl-ACP to form 
the elongated saturated acyl-ACP (enoyl-ACP reductase). 
β-ketoacyl-ACP synthase I (KAS I) is primarily responsible 
for elongation up to palmitoyl-ACP (C16:0), whereas 
β-ketoacyl-ACP synthase II (KAS II) is predominantly 
responsible for the final elongation to stearoyl-ACP 
(C18:0). 
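
The division of labour among the condensing enzymes described above can be sketched as a toy model. This is a minimal illustration only, assuming that every cycle adds exactly two carbons and that the first condensation (C2 to C4) is assigned to KAS III; the function names are hypothetical and are not taken from the specification.

```python
# Toy model of the plastid elongation cycle.
# Assumptions (not from the specification): each cycle adds exactly two
# carbons, and the C2 -> C4 step is assigned to KAS III.

def condensing_enzyme(chain_length: int) -> str:
    """Return the KAS isoform assumed to extend an acyl-ACP of this length."""
    if chain_length == 2:
        return "KAS III"   # short chain preferring condensing enzyme
    if chain_length < 16:
        return "KAS I"     # elongation up to palmitoyl-ACP (C16:0)
    if chain_length == 16:
        return "KAS II"    # final elongation to stearoyl-ACP (C18:0)
    raise ValueError("this model stops at C18")


def elongation_steps(target: int = 18):
    """List (substrate length, enzyme, product length) from C2 up to target."""
    if target % 2 or not 4 <= target <= 18:
        raise ValueError("target must be an even chain length from 4 to 18")
    steps = []
    length = 2
    while length < target:
        steps.append((length, condensing_enzyme(length), length + 2))
        length += 2
    return steps


if __name__ == "__main__":
    for substrate, enzyme, product in elongation_steps(18):
        print(f"C{substrate} -> C{product}: {enzyme}")
```

Under these assumptions, reaching C18 takes eight condensation cycles, of which only the last is attributed to KAS II.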

Genes encoding peptide components of β-ketoacyl-ACP 
synthases I and II have been cloned from a number of higher 
plant species, including castor (Ricinus communis) and 
Brassica species (USPN 5,510,255). KAS I activity was 
associated with a single synthase protein factor having an 
approximate molecular weight of 50 kD (synthase factor B), 
and KAS II activity was associated with a combination of two 
synthase protein factors, the 50 kD synthase factor B and a 
46 kD protein designated synthase factor A. Cloning and 
sequence of a plant gene encoding a KAS III protein has been 
reported by Tai and Jaworski (Plant Physiol. (1993) 
103:1361-1367). 

The end products of plant fatty acid synthetase 
activities are usually 16- and 18-carbon fatty acids. There 
are, however, several plant families that store large 
amounts of 8- to 14-carbon (medium-chain) fatty acids in 
their oilseeds. Recent studies with Umbellularia 
californica (California bay), a plant that produces seed oil 
rich in lauric acid (12:0), have demonstrated the existence 
of a medium-chain-specific isozyme of acyl-ACP thioesterase 
in the seed plastids. Subsequent purification of the 
12:0-ACP thioesterase from Umbellularia californica led to the 
cloning of a thioesterase cDNA which was expressed in seeds 
of Arabidopsis and Brassica resulting in a substantial 
accumulation of lauric acid in the triglyceride pools of 
these transgenic seeds (USPN 5,512,482). These results and 
subsequent studies with medium-chain thioesterases from 
other plant species have confirmed the chain-length-determining 
role of acyl-ACP thioesterases during de novo 
fatty acid biosynthesis (T. Voelker (1996) Genetic 
Engineering, Ed. J. K. Setlow, Vol. 18, pgs. 111-133). 

DESCRIPTION OF THE FIGURES 

Figure 1. DNA and translated amino acid sequence of Cuphea 
hookeriana KAS factor B clone chKAS B-2 are provided. 

Figure 2. DNA and translated amino acid sequence of Cuphea 
hookeriana KAS factor B clone chKAS B-31-7 are provided. 

Figure 3. DNA and translated amino acid sequence of Cuphea 
hookeriana KAS factor A clone chKAS A-2-7 are provided. 

Figure 4. DNA and translated amino acid sequence of Cuphea 
hookeriana KAS factor A clone chKAS A-1-6 are provided. 

Figure 5. DNA and translated amino acid sequence of Cuphea 
pullcherrima KAS factor B clone cpuKAS B/7-8 are provided. 

Figure 6. DNA and translated amino acid sequence of Cuphea 
pullcherrima KAS factor B clone cpuKAS B/8-7A are provided. 

Figure 7. DNA and translated amino acid sequence of Cuphea 
pullcherrima KAS factor A clone cpuKAS A/p7-6A are provided. 

Figure 8. Preliminary DNA sequence of Cuphea pullcherrima 
KAS factor A clone cpuKAS A/p8-9A is provided. 

Figure 9. DNA and translated amino acid sequence of Cuphea 
hookeriana KASIII clone chKASIII-27 are provided. 

Figure 10. The activity profile for purified cpuKAS B/8-7A 
using various acyl-ACP substrates is provided. 

Figure 11. The activity profile for purified chKAS A-2-7 
and chKAS A-1-6 using various acyl-ACP substrates is 
provided. 

Figure 12. The activity profile for purified castor KAS 
factor A using various acyl-ACP substrates is provided. 

Figure 13. The activity profile for purified castor KAS 
factor B using various acyl-ACP substrates is provided. 

Figure 14. A graph showing the number of plants arranged 
according to C8:0 content for transgenic plants containing 
CpFatB1 versus transgenic plants containing CpFatB1 + chKAS 
A-2-7 is provided. 

Figure 15. Graphs showing the %C10/%C8 ratios in transgenic 
plants containing ChFatB2 (4804-22-357) and in plants 
resulting from crosses between 4804-22-357 and 5401-9 (chKAS 
A-2-7 plants) are provided. 

Figure 16. Graphs showing the %C10 + %C8 contents in 
transgenic plants containing ChFatB2 (4804-22-357) and in 
plants resulting from crosses between 4804-22-357 and 5401-9 
(chKAS A-2-7 plants) are provided. 

Figure 17. Graphs showing the %C10/%C8 ratios in transgenic 
plants containing ChFatB2 (4804-22-357) and in plants 
resulting from crosses between 4804-22-357 and 5413-17 (chKAS 
A-2-7 + CpFatB1 plants) are provided. 

Figure 18. Graphs showing the %C10 + %C8 contents in 
transgenic plants containing ChFatB2 (4804-22-357) and in 
plants resulting from crosses between 4804-22-357 and 5413-17 
(chKAS A-2-7 + CpFatB1 plants) are provided. 

Figure 19. Graphs showing the %C12:0 in transgenic plants 
containing Uc FatB1 (LA86DH186) and in plants resulting from 
crosses with wild type (X WT) and with lines expressing Ch 
KAS A-2-7. 

Figure 20. Graph showing the relative proportions of C12:0 
and C14:0 fatty acids in the seeds of transgenic plants 
containing Uc FatB1 (LA86DH186) and in plants resulting from 
crosses with wild type (X WT) and with lines expressing Ch 
KAS A-2-7. 

Figure 21. Graphs showing the %C18:0 in transgenic plants 
containing Garm FatB1 (5266) and in seeds of plants resulting 
from crosses with wild type (X WT) and with lines expressing 
Ch KAS A-2-7. 

Figure 22. The activity profile of Ch KAS A in protein 
extracts from transgenic plants containing Ch KAS A-2-7. 
Extracts were pretreated with the indicated concentrations 
of cerulenin. 


SUMMARY OF THE INVENTION 

By this invention, compositions and methods of use 
related to β-ketoacyl-ACP synthase (KAS) are provided. Also 
of interest are methods and compositions of amino acid and 
nucleic acid sequences related to biologically active plant 
synthase(s). 

In particular, genes encoding KAS protein factors A and 
B from Cuphea species are provided. The KAS genes are of 
interest for use in a variety of applications, and may be 
used to provide synthase I and/or synthase II activities in 
transformed host cells, including bacterial cells, such as 
E. coli, and plant cells. Synthase activities are 
distinguished by the preferential activity towards longer 
and shorter acyl-ACPs as well as by the sensitivity towards 
the KAS specific inhibitor, cerulenin. Synthase protein 
preparations having preferential activity towards medium 
chain length acyl-ACPs are synthase I-type or KAS I. The 
KAS I class is sensitive to inhibition by cerulenin at 
concentrations as low as 1 µM. Synthases having preferential 
activity towards longer chain length acyl-ACPs are synthase 
II-type or KAS II. The KAS enzymes of the II-type are also 
sensitive to cerulenin, but at higher concentrations (50 µM). 
Synthase III-type enzymes have preferential activity towards 
short chain length acyl-ACPs and are insensitive to 
cerulenin inhibition. 

Nucleic acid sequences encoding a synthase protein may 
be employed in nucleic acid constructs to modulate the 
amount of synthase activity present in the host cell, 
especially the relative amounts of synthase I-type, synthase 
II-type and synthase III-type activity when the host cell is 
a plant host cell. In addition, nucleic acid constructs may 
be designed to decrease expression of endogenous synthase in 
a plant cell as well. One example is the use of an anti-sense 
synthase sequence under the control of a promoter 
capable of expression in at least those plant cells which 
normally produce the enzyme. 

Of particular interest in the present invention is the 
coordinate expression of a synthase protein with the 
expression of thioesterase proteins. For example, 
coordinated expression of synthase factor A and a 
medium-chain thioesterase provides a method for increasing the 
level of medium-chain fatty acids which may be harvested 
from transgenic plant seeds. Furthermore, coordinated 
expression of a synthase factor A gene with plant 
medium-chain thioesterase proteins also provides a method by which 
the ratios of various medium-chain fatty acids produced in a 
transgenic plant may be modified. For example, by 
expression of a synthase factor A, it is possible to 
increase the ratio of C10/C8 fatty acids which are produced 
in plant seed oils as the result of expression of a 
thioesterase having activity on C8 and C10 fatty acids. 

DETAILED DESCRIPTION OF THE INVENTION 

A plant synthase factor protein of this invention 
includes a sequence of amino acids or polypeptide which is 
required for catalyzation of a condensation reaction between 
an acyl-ACP having a chain length of C2 to C16 and 
malonyl-ACP in a plant host cell. A particular plant synthase 
factor protein may be capable of catalyzing a synthase 
reaction in a plant host cell (for example, as a monomer or 
homodimer) or may be one component of a multiple peptide 
enzyme which is capable of catalyzing a synthase reaction in 
a plant host cell, i.e. one peptide of a heterodimer. 

Synthase I (KAS I) demonstrates preferential activity 
towards acyl-ACPs having shorter carbon chains, C2-C14, and 
is sensitive to inhibition by cerulenin at concentrations of 
1 µM. Synthase II (KAS II) demonstrates preferential 
activity towards acyl-ACPs having longer carbon chains, 
C14-C16, and is inhibited by concentrations of cerulenin (50 µM). 
Synthase III demonstrates preferential activity towards 
acyl-CoAs having very short carbon chains, C2 to C6, and is 
insensitive to inhibition by cerulenin. 
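
The cerulenin criterion above can be expressed as a simple decision rule. The following is a hedged sketch only: the function name is hypothetical, and treating 1 µM as a hard cut-off between the I-type and II-type classes is an assumption made for illustration, since the text gives representative concentrations (1 µM and 50 µM) rather than a formal threshold.

```python
from typing import Optional


def classify_by_cerulenin(inhibiting_conc_um: Optional[float]) -> str:
    """Classify a synthase preparation by its cerulenin sensitivity.

    inhibiting_conc_um is the lowest cerulenin concentration (in µM) at
    which inhibition is observed, or None if no inhibition is observed.
    The 1 µM cut-off is an illustrative assumption.
    """
    if inhibiting_conc_um is None:
        return "synthase III-type"   # insensitive to cerulenin
    if inhibiting_conc_um <= 0:
        raise ValueError("concentration must be positive")
    if inhibiting_conc_um <= 1.0:
        return "synthase I-type"     # sensitive at about 1 µM
    return "synthase II-type"        # sensitive only at higher levels (e.g. 50 µM)


if __name__ == "__main__":
    for conc in (1.0, 50.0, None):
        print(conc, "->", classify_by_cerulenin(conc))
```

In practice the chain-length preference of the preparation would be considered alongside inhibitor sensitivity; this sketch covers the inhibitor criterion alone.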

Synthase factors A and B, and synthase III proteins 
obtained from medium-chain fatty acid producing plant 
species of the genus Cuphea are described herein. As 
described in the following Examples, synthase A from C. 
hookeriana is naturally expressed at a high level and only 
in the seeds. C. hookeriana synthase B is expressed at low 
levels in all tissues examined. Expression of synthase A 
and synthase B factors in E. coli and purification of the 
resulting proteins is employed to determine activity of the 
various synthase factors. Results of these analyses 
indicate that synthase factor A from Cuphea hookeriana has 
the greatest activity on 6:0-ACP substrates, whereas 
synthase factor B from Cuphea pullcherrima has greatest 
activity on 14:0-ACP. Similar studies with synthase factors 
A and B from castor demonstrate similar activity profiles 
between the factor B synthase proteins from Cuphea and 
castor. The synthase A clone from castor, however, 
demonstrates a preference for 14:0-ACP substrate. 

Expression of a Cuphea hookeriana KAS A protein in 
transgenic plant seeds which normally do not produce 
medium-chain fatty acids does not result in any detectable 
modification of the fatty acid types and contents produced 
in such seeds. However, when Cuphea hookeriana KAS A 
protein is expressed in conjunction with expression of a 
medium-chain acyl-ACP thioesterase capable of providing for 
production of C8 and C10 fatty acids in plant seed oils, 
increases in the levels of medium-chain fatty acids over the 
levels obtainable by expression of the medium-chain 
thioesterase alone are observed. In addition, where 
significant amounts of C8 and C10 fatty acids are produced 
as the result of medium-chain thioesterase expression, 
co-expression of a Cuphea KAS A protein also results in an 
alteration of the proportion of the C8 and C10 fatty acids 
that are obtained. For example, an increased proportion of 
C10 fatty acids may be obtained by co-expression of Cuphea 
hookeriana ChFatB2 thioesterase and a chKAS A synthase 
factor protein. 
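
The two quantities used to compare lines in Figures 15 to 18 (the %C10/%C8 ratio and the %C10 + %C8 content) can be computed from a seed oil fatty acid profile as sketched below. The function name and the numeric values are hypothetical and serve only to illustrate the arithmetic; they are not measured data from this specification.

```python
def medium_chain_summary(composition: dict) -> dict:
    """Summarise C8 and C10 levels from a fatty acid profile.

    composition maps fatty acid names (e.g. "C8:0") to percent of total
    fatty acids. Returns the C10/C8 ratio (None when no C8 is present)
    and the combined C8 + C10 content.
    """
    c8 = composition.get("C8:0", 0.0)
    c10 = composition.get("C10:0", 0.0)
    ratio = c10 / c8 if c8 > 0 else None
    return {"C10/C8": ratio, "C8+C10": c8 + c10}


if __name__ == "__main__":
    # Hypothetical profiles, for illustration only.
    thioesterase_only = {"C8:0": 5.0, "C10:0": 10.0}
    with_kas_a = {"C8:0": 4.0, "C10:0": 16.0}
    print(medium_chain_summary(thioesterase_only))
    print(medium_chain_summary(with_kas_a))
```

Reporting the ratio separately from the sum keeps a shift in the C8/C10 proportion distinguishable from a change in total medium-chain content.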

Furthermore, when Cuphea hookeriana KAS A protein is 
expressed in conjunction with expression of a medium-chain 
acyl-ACP thioesterase capable of providing for production of 
C12 fatty acids in plant seed oils, increases in the levels 
of medium-chain fatty acids over the levels obtainable by 
expression of the medium-chain thioesterase alone are also 
observed. In addition, where significant amounts of C12 and 
C14 fatty acids are produced as the result of medium-chain 
thioesterase expression, co-expression of a Cuphea KAS A 
protein also results in an alteration of the proportion of 
the C12 and C14 fatty acids that are obtained. For example, 
an increased proportion of C12 fatty acids may be obtained 
by co-expression of Uc FatB1 thioesterase and a chKAS A 
synthase factor protein. 

However, when Cuphea hookeriana KAS A protein is 
expressed in conjunction with the expression of a long-chain 
acyl-ACP thioesterase capable of providing for production of 
C18 and C18:1 fatty acids in plant seed oils, no effect on 
the production of long chain fatty acids was observed. 
Furthermore, when plants transformed to express a long chain 
acyl-ACP thioesterase from mangosteen (GarmFatA1, U.S. 
Patent Application No. 08/440,845), which preferentially 
hydrolyzes C18:0 and C18:1 fatty acyl-ACPs, are crossed with 
nontransformed control plants, a significant reduction in 
the levels of C18:0 is obtained. Similar reductions are also 
observed in the levels of C18:0 in the seeds of plants 
resulting from crosses between plants transformed to express 
the GarmFatA1 and plants expressing the Cuphea hookeriana 
KAS A protein. 

Thus, the instant invention provides methods of 
increasing and/or altering the medium-chain fatty acid 
compositions in transgenic plant seed oils by co-expression 
of medium-chain acyl-ACP thioesterases with synthase factor 
proteins. Furthermore, various combinations of synthase 
factors and medium-chain thioesterases may be achieved 
depending upon the particular fatty acids desired. For 
example, for increased production of C14 fatty acids, 
synthase protein factors may be expressed in combination 
with a C14 thioesterase, for example from Cuphea palustris 
or nutmeg (WO 96/23892). In addition, 
thioesterase expression may be combined with a number of 
different synthase factor proteins for additional effects on 
medium-chain fatty acid composition. 

Synthases of use in the present invention include 
modified amino acid sequences, such as sequences which have 
been mutated, truncated, increased and the like, as well as 
such sequences which are partially or wholly artificially 
synthesized. The synthase protein encoding sequences 
provided herein may be employed in probes for further 
screening or used in genetic engineering constructs for 
transcription or transcription and translation in host 
cells, especially plant host cells. One skilled in the art 
will readily recognize that antibody preparations, nucleic 
acid probes (DNA and RNA) and the like may be prepared and 
used to screen and recover synthases and/or synthase nucleic 
acid sequences from other sources. Typically, a 
homologously related nucleic acid sequence will show at 
least about 60% homology, and more preferably at least about 
70% homology, between the R. communis synthase and the given 
plant synthase of interest, excluding any deletions which 
may be present. Homology is determined upon comparison of 
sequence information, nucleic acid or amino acid, or through 
hybridization reactions. 
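
One way to read "homology ... excluding any deletions" is as percent identity over the aligned positions where neither sequence has a gap. The sketch below implements that reading; it is an assumption for illustration, since the text does not prescribe a particular algorithm, and the function name and example sequences are hypothetical. The inputs are expected to be pre-aligned and of equal length.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences.

    Positions where either sequence carries a gap character are skipped,
    so deletions do not count against the score.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # exclude deleted positions from the comparison
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Short made-up sequences; one gapped column and one mismatch.
    print(percent_identity("ATGC-ATTGA", "ATGCTATAGA"))
```

The same function works for nucleic acid or amino acid alignments, since it only compares characters column by column.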

Recombinant constructs containing a nucleic acid 
sequence encoding a synthase protein factor or nucleic acid 
sequences encoding a synthase protein factor and a 
medium-chain acyl-ACP thioesterase may be prepared by methods well 
known in the art. Constructs may be designed to produce 
synthase in either prokaryotic or eukaryotic cells. The 
increased expression of a synthase in a plant cell, 
particularly in conjunction with expression of medium-chain 
thioesterases, or decreasing the amount of endogenous 
synthase observed in plant cells are of special interest. 



WO 98/46776    PCT/US98/07114 



Synthase protein factors may be used, alone or in 
combination, to catalyze the elongating condensation 
reactions of fatty acid synthesis depending upon the desired 
result. For example, rate influencing synthase activity may 
reside in synthase I-type, synthase II-type, synthase 
III-type or in a combination of these enzymes. Furthermore, 
synthase activities may rely on a combination of the various 
synthase factors described herein. 

Constructs which contain elements to provide the 
transcription and translation of a nucleic acid sequence of 
interest in a host cell are "expression cassettes". 
Depending upon the host, the regulatory regions will vary, 
including regions from structural genes from viruses, 
plasmid or chromosomal genes, or the like. For expression 
in prokaryotic or eukaryotic microorganisms, particularly 
unicellular hosts, a wide variety of constitutive or 
regulatable promoters may be employed. Among 
transcriptional initiation regions which have been described 
are regions from bacterial and yeast hosts, such as E. coli, 
B. subtilis, Saccharomyces cerevisiae, including genes such 
as β-galactosidase, T7 polymerase, trp-lac (tac), trp E and 
the like. 

An expression cassette for expression of synthase in a 
plant cell will include, in the 5' to 3' direction of 
transcription, a transcription and translation initiation 
control regulatory region (also known as a "promoter") 
functional in a plant cell, a nucleic acid sequence encoding 
a synthase, and a transcription termination region. 
Numerous transcription initiation regions are available 
which provide for a wide variety of constitutive or 
regulatable, e.g., inducible, transcription of the 
desaturase structural gene. Among transcriptional 
initiation regions used for plants are such regions 
associated with cauliflower mosaic viruses (35S, 19S), and 
structural genes such as for nopaline synthase or mannopine 
synthase or napin and ACP promoters, etc. The 
transcription/translation initiation regions corresponding 
to such structural genes are found immediately 5' upstream 
to the respective start codons. Thus, depending upon the 
intended use, different promoters may be desired. 
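
The 5' to 3' arrangement of a plant expression cassette described above can be captured in a small data structure. This is a minimal sketch for illustration: the class name and the particular combination of elements shown are hypothetical and do not describe any specific construct of this specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ExpressionCassette:
    """Elements of a plant expression cassette, in order of transcription."""
    promoter: str          # transcription/translation initiation region
    coding_sequence: str   # nucleic acid sequence encoding the synthase
    terminator: str        # transcription termination region

    def elements_5_to_3(self) -> List[str]:
        """Return the elements in the 5' to 3' direction of transcription."""
        return [self.promoter, self.coding_sequence, self.terminator]


# Hypothetical example: a seed-preferred promoter driving a KAS A clone.
cassette = ExpressionCassette(
    promoter="napin 5' regulatory region",
    coding_sequence="chKAS A-2-7",
    terminator="napin 3' termination region",
)

if __name__ == "__main__":
    print(" -> ".join(cassette.elements_5_to_3()))
```

Keeping the three elements as separate fields reflects the point made in the text that the promoter can be swapped according to the intended use without altering the coding sequence.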

Of special interest in this invention is the use of 
promoters which are capable of preferentially expressing the 
synthase in seed tissue, in particular, at early stages of 
seed oil formation. Examples of such seed-specific promoters 
include the region immediately 5' upstream of a napin or 
seed ACP gene such as described in USPN 5,420,034, 
desaturase genes such as described in Thompson et al. (Proc. 
Nat. Acad. Sci. (1991) 88:2578-2582), or a Bce-4 gene such 
as described in USPN 5,530,194. Alternatively, the use of 
the 5' regulatory region associated with the plant synthase 
structural gene, i.e., the region immediately 5' upstream to 
a plant synthase structural gene and/or the transcription 
termination regions found immediately 3' downstream to the 
plant synthase structural gene, may often be desired. In 
general, promoters will be selected based upon their 
expression profile which may change given the particular 
application. 
In addition, one may choose to provide for the 
transcription or transcription and translation of one or 
more other sequences of interest in concert with the 
expression or anti-sense of the synthase sequence, 
particularly medium-chain plant thioesterases such as 
described in USPN 5,512,482, to effect alterations in the 
amounts and/or composition of plant oils. 

When one wishes to provide a plant transformed for the 
combined effect of more than one nucleic acid sequence of 
interest, a separate nucleic acid construct may be provided 
for each or the constructs may both be present on the same 
plant transformation construct. The constructs may be 
introduced into the host cells by the same or different 
methods, including the introduction of such a trait by 
crossing transgenic plants via traditional plant breeding 
methods, so long as the resulting product is a plant having 
both characteristics integrated into its genome. 

Normally, included with the DNA construct will be a 
structural gene having the necessary regulatory regions for 
expression in a host and providing for selection of 
transformed cells. The gene may provide for resistance to a 
cytotoxic agent, e.g. antibiotic, heavy metal, toxin, etc., 
complementation providing prototrophy to an auxotrophic 
host, viral immunity or the like. Depending upon the number 
of different host species into which the expression 
construct or components thereof are introduced, one or more 
markers may be employed, where different conditions for 
selection are used for the different hosts. 

The manner in which the DNA construct is introduced 
into the plant host is not critical to this invention. Any 
method which provides for efficient transformation may be 
employed. Various methods for plant cell transformation 
include the use of Ti- or Ri-plasmids, microinjection, 
electroporation, liposome fusion, DNA bombardment or the 
like. In many instances, it will be desirable to have the 
construct bordered on one or both sides by T-DNA, 
particularly having the left and right borders, more 
particularly the right border. This is particularly useful 
when the construct uses A. tumefaciens or A. rhizogenes as a 
mode for transformation, although the T-DNA borders may find 
use with other modes of transformation. 

The expression constructs may be employed with a wide 
variety of plant life, particularly plant life involved in 
the production of vegetable oils. These plants include, but 
are not limited to rapeseed, peanut, sunflower, safflower, 
cotton, soybean, corn and oilseed palm. 

For transformation of plant cells using Agrobacterium, 
explants may be combined and incubated with the transformed 
Agrobacterium for sufficient time for transformation, the 
bacteria killed, and the plant cells cultured in an 
appropriate selective medium. Once callus forms, shoot 
formation can be encouraged by employing the appropriate 
plant hormones in accordance with known methods and the 
shoots transferred to rooting medium for regeneration of 
plants. The plants may then be grown to seed and the seed 
used to establish repetitive generations and for isolation 
of vegetable oils. 
The invention now being generally described, it will be 
more readily understood by reference to the following 
examples which are included for purposes of illustration 
only and are not intended to limit the present invention. 

Example 1 Cuphea KAS Factor A and B Gene Cloning 

Total RNA isolated from developing seeds of Cuphea 
hookeriana and Cuphea pullcherrima was used for cDNA 
synthesis in commercial λ-based cloning vectors. For 
cloning each type of KAS gene, approximately 400,000-500,000 
unamplified recombinant phage were plated and the plaques 
transferred to nitrocellulose. For KAS factor B cloning 
from C. hookeriana, a mixed probe containing Brassica napus 
KAS factor B and Ricinus communis (castor) KAS factor B 
radiolabeled cDNAs was used. Similarly, a mixed probe 
containing Brassica napus KAS factor A and Ricinus communis 
KAS factor A cDNA clones was used to obtain C. hookeriana 
KAS factor A genes. For KASIII, a spinach KASIII cDNA 
clone obtained from Dr. Jan Jaworski was radiolabeled and 
used as a probe to isolate a KASIII clone from C. 
hookeriana. For KAS B and KAS A cloning from C. 
pullcherrima, C. hookeriana KAS B and KAS A genes chKAS B-2 
and chKAS A-2-7 (see below) were radiolabeled and used as 
probes. 

DNA sequence and translated amino acid sequence for 
Cuphea KAS clones are provided in Figures 1-9. Cuphea 
hookeriana KAS factor B clones chKAS B-2 and chKAS B-31-7 
are provided in Figures 1 and 2. Neither of the clones is 
full length. Cuphea hookeriana KAS factor A clones chKAS 
A-2-7 and chKAS A-1-6 are provided in Figures 3 and 4. chKAS 
A-2-7 contains the entire encoding sequence for the KAS 
factor protein. Based on comparison with other plant 
synthase proteins, the transit peptide is believed to be 
represented in the amino acids encoded by nucleotides 
125-466. chKAS A-1-6 is not a full length clone although some 
transit peptide encoding sequence is present. Nucleotides 
1-180 represent transit peptide encoding sequence, and the 
mature protein encoding sequence is believed to begin at 
nucleotide 181. 

Cuphea pullcherrima KAS factor B clones cpuKAS B/7-8 
and cpuKAS B/8-7A are provided in Figures 5 and 6. Both of 
the clones contain the entire encoding sequences for the KAS 
factor B proteins. The first 35 amino acids of cpuKAS B/7-8 
are believed to represent the transit peptide, with the 
mature protein encoding sequence beginning at nucleotide 
233. The first 39 amino acids of cpuKAS B/8-7A are believed 
to represent the transit peptide, with the mature protein 
encoding sequence beginning at nucleotide 209. Cuphea 
pullcherrima KAS factor A clones cpuKAS A/p7-6A and cpuKAS 
A/p8-9A are provided in Figures 7 and 8. Both of the clones 
contain the entire encoding sequences for the KAS factor A 
proteins. Translated amino acid sequence of cpuKAS A/p7-6A 
is provided. The mature protein is believed to begin at the 
lysine residue encoded by nucleotides 595-597, and the first 
126 amino acids are believed to represent the transit peptide. 
The DNA sequence of KAS A clone cpuKAS A/p8-9A is preliminary. 
Further analysis will be conducted to determine final DNA 
sequence and reveal the amino acid sequence encoded by this 
gene. 

DNA and translated amino acid sequence of Cuphea 
hookeriana KASIII clone chKASIII-27 are provided in Figure 9. 
The encoding sequence from nucleotides 37-144 of chKASIII-27 
is believed to encode a transit peptide, and the presumed 
mature protein encoding sequence is from nucleotides 
145-1233. 
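
The nucleotide coordinates quoted in this example can be converted to codon counts with simple arithmetic. The sketch below is illustrative only: the helper names are hypothetical, the ranges are treated as inclusive and in frame, and the back-calculated start codon positions assume that the transit peptide begins at the initiator codon of an uninterrupted cDNA. Whether a range includes a stop codon is not stated in the text and is not assumed here.

```python
def codons_in_range(first_nt: int, last_nt: int) -> int:
    """Number of codons in an inclusive, in-frame nucleotide range."""
    length = last_nt - first_nt + 1
    if length <= 0 or length % 3:
        raise ValueError("range must be a positive multiple of three")
    return length // 3


def start_codon_position(mature_start_nt: int, transit_residues: int) -> int:
    """First nucleotide of the reading frame, given where the mature
    protein begins and how many residues the transit peptide spans."""
    return mature_start_nt - 3 * transit_residues


if __name__ == "__main__":
    # chKASIII-27: transit peptide 37-144, presumed mature region 145-1233.
    print(codons_in_range(37, 144))      # codons in the transit region
    print(codons_in_range(145, 1233))    # codons in the mature region
    # chKAS A-2-7: transit peptide encoded by nucleotides 125-466.
    print(codons_in_range(125, 466))
    # cpuKAS B/7-8: 35-residue transit peptide, mature region from nt 233.
    print(start_codon_position(233, 35))
```

Under these assumptions the chKASIII-27 transit region corresponds to 36 codons, and the cpuKAS A/p7-6A figures are mutually consistent: a 126-residue transit peptide ending just before the lysine codon at 595-597 implies a reading frame that starts at nucleotide 217.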

Deduced amino acid sequence of the C. hookeriana KAS 
factor B and KAS factor A cDNAs reveals strong homology to 
the Brassica napus and Ricinus communis clones previously 
reported. The C. hookeriana KAS factor B clone is more 
homologous to the Ricinus and Brassica KAS factor B clones 
(94% and 91% respectively) than it is to the Ricinus and 
Brassica KAS factor A clones (60% for both). Furthermore, 
the C. hookeriana KAS factor A clone is more homologous to 
the Ricinus and Brassica KAS factor A clones (85% and 82% 
respectively) than it is to the Ricinus and Brassica KAS factor 
B clones (60% for both). The C. hookeriana KAS factor B 
cDNAs designated as chKAS B-2 and chKAS B-31-7 are 96% 
identical within the mature portion of the polypeptide. 
Similarly, the deduced amino acid sequences of the mature 
protein regions of the C. hookeriana KAS factor A clones 
chKAS A-2-7 and chKAS A-1-6 are 96% identical. The C. 
pullcherrima KAS clones also demonstrate homology to the R. 
communis and Brassica napus KAS clones. The mature protein 
portions of all of the KAS factor A family members in the 
different Cuphea species are 95% identical. Similarly, the 
mature protein portions of the KAS factor B genes in Cuphea 
are also 95-97% identical with each other. However, there is 
only approximately 60% sequence identity between KAS factor 
B and KAS factor A clones either within the same or 
different species of Cuphea. 

Example 2 Levels and Patterns of Expression 

To examine tissue specificity of KAS expression in 
Cuphea hookeriana, Northern blot analysis was conducted 
using total RNA isolated from seed, root, leaf and flower 
tissue. Two separate but identical blots were hybridized 
with either chKAS B-31-7 or chKAS A-2-7 coding region 
probes. The data from this RNA blot analysis indicate that 
KAS B is expressed at a similar level in all tissues 
examined, whereas KAS A expression is detected only in the 
seed. These results also demonstrate a different level of 
expression for each of the synthases. KAS A is an abundant 
message, whereas KAS B is expressed at low levels. 
Furthermore, even under highly stringent hybridization 
conditions (65°C, 0.1 X SSC, 0.5% SDS), the KAS A probe 
hybridizes equally well with two seed transcripts of 2.3 and 
1.9 kb. The larger hybridizing band is likely the 
transcript of the KAS A-2-7 gene since the size of its cDNA 
is 2046 bp, and the number of clones obtained from cDNA 
screening corresponds well with the apparent mobility of the 
mRNA and its abundance on the blot. 
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g^siwpTft 3 Eaqpression of Plant K3U3 Genes in E.coll 

DNA fragments encoding the mature polypeptide of the
Cuphea hookeriana KAS A cDNAs and the Cuphea pullcherrima
KAS B cDNAs were obtained by PCR and cloned into a
QIAexpress expression vector (Qiagen). Experimental
conditions for maximum level of expression were determined
for all of these clones, and the parameters giving the highest
level in the soluble fraction were identified. Cells are grown in
ECLB media containing 1M sorbitol and 2.5 mM betaine
overnight and subcultured as a 1:4 dilution in the same
medium. Cells are then grown for 2 hours (to approximately
0.6-0.8 O.D.), induced with 0.4 mM IPTG and allowed to grow
for 5 more hours.

Enzyme activity of the affinity purified recombinant
enzymes obtained from over-expression of the chKAS A-2-7 and
cpuKAS B/8-7A clones was measured using a wide range of
acyl-ACP substrates (6:0- to 16:1-ACP). The activity
profile for cpuKAS B/8-7A is provided in Fig. 10. The data
demonstrate that the enzyme is active with all acyl-ACP
substrates examined, although activity on 6:0- to 14:0-ACP
substrates is substantially greater than the activity on
16:0 and 16:1 substrates.

The activity profile of the C. hookeriana KAS A clones
chKAS A-2-7 and chKAS A-1-6 is provided in Figure 11. The C.
hookeriana KAS A clones are most active with C6:0, and have
the least activity with C16:0 substrates. However, the
activity of these clones on even the preferred C6:0 substrate
is 50-fold lower than the activity of the C. pullcherrima
KAS B clones.

A fragment containing the mature protein encoding
portion of a R. communis KAS factor A clone was also cloned
into a QIAexpress expression vector, expressed in E. coli
and the enzyme affinity purified as described above. The
activity profile for castor KAS A is provided in Figure 12.
Highest activity is observed with C14:0 substrates, although
some activity is also seen with C6:0 and C16:1. In
comparison, the activity profile obtained from purified R.
communis KAS factor B, also using the QIAexpress expression
system, is provided in Figure 13. The KAS B clone
demonstrates substantially higher levels of activity (10-fold
and higher) than the R. communis KAS A clone. The
preference of the KAS factor B for 6:0- to 14:0-ACP
substrates is consistent with the previous observations that
this protein provides KAS I activity.

Example 4 KAS and TE Expression in Transgenic Seed

Both the CpFatB1 (C. hookeriana thioesterase cDNA;
Dehesh et al. (1996) Plant Physiol. 110:203-210) and the
chKAS A-2-7 were PCR amplified, sequenced, and cloned into a
napin expression cassette. The napin/CpFatB1 and the
napin/KAS A-2-7 fusions were ligated separately into the
binary vector pCGN1558 (McBride and Summerfelt (Pl. Mol. Biol.
(1990) 14:269-276)) and transformed into A. tumefaciens,
EHA101. The resulting CpFatB1 binary construct is pCGN5400
and the chKAS A-2-7 construct is pCGN5401.
Agrobacterium-mediated transformation of a Brassica napus
canola variety was carried out as described by Radke et al.
(Theor. Appl. Genet. (1988) 75:685-694; Plant Cell Reports
(1992) 11:499-505). Several transgenic events were produced for
each of the pCGN5400 and pCGN5401 constructs.

A double gene construct containing a napin/cpFatB1
expression construct in combination with a napin/chKAS A-2-7
expression construct was also assembled, ligated into a
binary vector and used for co-cultivation of a canola
Brassica variety. The binary construct containing the
chFatB1 and chKAS A-2-7 expression constructs is pCGN5413.

Fatty acid analysis of 26 transgenic lines containing
chKAS A-2-7 (5401 lines) showed no significant changes in
the oil content or profile as compared to similar analyses
of wild type canola seeds of the transformed variety.

Fatty acid analysis of 36 transgenic lines containing
cpFatB1 (5400 lines) showed increased levels of C:8 and C:10
in transgenic seeds. The highest level of C:8 observed in a
pooled seed sample was 4.2 mol%. The C:10 levels were between
30 and 35% of the C:8 content. Fatty acid analysis of 25
transgenic lines containing the TE/KAS A tandem (5413 lines)
demonstrated an overall increase in both C:8 and C:10 levels
relative to those observed with lines containing the TE (5400)
alone. In lines containing the cpFatB1 construct alone, the
average level of C:8 was 1.5 mol%, whereas the average C:8
level in lines containing the TE/KAS A tandem was 2.37
mol%. The ratio of C:8 to C:10 remained constant in both
populations. The number of transgenic events relative to
the C:8 content is presented in Figure 14. These data show
that the transgenic events with the tandem TE/KAS A construct
yield more lines with higher levels of C:8 than those events
with the single TE construct. For example, several lines
containing nearly 7 mole% C8 were obtained with the TE/KAS A
pCGN5413 construct, whereas the highest C8 containing line
from the pCGN5400 TE alone transformation contained 4.2
mole% C8.

Half seed analysis of the T3 generation of transgenic
canola plants expressing a ChFatB2 (C. hookeriana
thioesterase; Dehesh et al. (1996) The Plant Journal
9:161-172) indicates that these plants can accumulate up to 22
weight% (33 mol%) of 8:0 and 10:0 fatty acids (4804-22-357).
Segregation analysis shows that these transformants contain
two loci and that they are now homozygous. Selected plants
grown from these half seeds were transferred into the
greenhouse and later crossed with T1 transformants that had
been transformed with either Cuphea hookeriana KAS A (5401)
alone or KAS A/CpFatB1 double constructs (5413).
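
The paragraph above reports the same medium-chain content both as
weight% and as mol%. The two differ because short fatty acids have a
lower molar mass than the C18 acids that make up most canola oil. The
following Python sketch shows the conversion under stated
assumptions: molar masses are computed for free fatty acids from
standard atomic masses, and the example composition (with everything
other than C8 and C10 lumped together as 18:1) is hypothetical and
is not data from the specification.

```python
def fatty_acid_molar_mass(carbons: int, double_bonds: int = 0) -> float:
    """Molar mass (g/mol) of a straight-chain free fatty acid CnH(2n-2d)O2."""
    hydrogens = 2 * carbons - 2 * double_bonds
    return carbons * 12.011 + hydrogens * 1.008 + 2 * 15.999


def weight_to_mol_percent(weight_percent: dict) -> dict:
    """Convert {(carbons, double_bonds): weight %} to mol %."""
    moles = {
        acid: w / fatty_acid_molar_mass(*acid)
        for acid, w in weight_percent.items()
    }
    total = sum(moles.values())
    return {acid: 100.0 * n / total for acid, n in moles.items()}


# Hypothetical composition: 22 weight% medium-chain acids, with the
# remainder lumped together as 18:1 for simplicity.
example = {(8, 0): 10.0, (10, 0): 12.0, (18, 1): 78.0}
mol = weight_to_mol_percent(example)
medium_chain_mol = mol[(8, 0)] + mol[(10, 0)]
print(f"C8 + C10: {medium_chain_mol:.1f} mol%")
```

Because each gram of C8 or C10 acid contains more molecules than a
gram of C18 acid, the mol% of the medium-chain fraction is always
higher than its weight% in a mixture dominated by longer chains.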

Fatty acid analysis of several events resulting from
the crosses between transgenic lines containing ChFatB2
(4804-22-357) and chKAS A-2-7 (5401-9) reveals an increase
in the ratio of C:10/C:8 levels (Figure 15). This C:10/C:8
ratio in nearly all of the transgenic events containing
ChFatB2 TE alone fluctuates between 3 and 6, whereas in the
F1 generation of transgenics containing both the TE and the
KAS A-2-7, the ratio can be as high as 22. This increase in
C:10 levels is accompanied by an increase in the total C:8
and C:10 content (Figure 16). The sum of the C:8 and C:10
fatty acids in the heterozygous F1 lines is as high as that
in the homozygous parent line (4804-22-357), whereas the
heterozygous lines usually contain substantially less C:8
and C:10 than the homozygous lines.

Similar results were observed in F1 generation seeds
resulting from crosses performed between 4804-22-357
(ChFatB2) and the 5413-17 event (CpFatB1 and chKAS A-2-7
tandem). Levels of C:8 and C:10 in the 5413-17 line were
6.3 and 2.8 mol%, respectively. Data presented in Figure 17
show that there is a shift towards C:10 fatty acids, as was
observed with the 4804-22-357 (ChFatB2) x 5401-9
(chKAS A-2-7) crosses. Furthermore, Figure 18 indicates the
presence of two separate populations of heterozygotes. Those
containing approximately 9-11 weight percent C:10 + C:8 are
believed to represent offspring containing a single copy of
the ChFatB1 TE gene and no copies of the CpFatB1 and chKAS A
genes from 5413. Those plants containing approximately
15-20 weight percent C:10 + C:8 are believed to represent the
heterozygotes containing a single ChFatB1 TE gene as well as
the CpFatB1 and chKAS A genes from 5413. Thus, the level of
the C:10 + C:8 fatty acids does not decrease to 50% of that
detected in parent lines when a copy of the ChKAS A gene is
present.

To further characterize the chain length specificity of
the Cuphea hookeriana KAS A enzyme, crosses between
transgenic Brassica napus lines containing a California Bay
(Umbellularia californica) 12:0 specific thioesterase, Uc
FatB1 (USPN 5,344,771) and chKAS A-2-7 (5401-9) were made.
Half seed analysis of transgenic plants containing Uc FatB1
has previously indicated that these plants can accumulate
up to 52 mol% C12:0 in the seed oil of homozygous dihaploid
lines (LA86DH186). Crosses between the line LA86DH186 and
untransformed control Brassica demonstrated a decrease in
the C12:0 levels.

However, crosses between LA86DH186 and the 5401-9
hemizygous line led to an accumulation of up to 57 mol%
C12:0 in the seed oil of F1 progeny (Figure 19).
Interestingly, in crosses with LA86DH186 x untransformed
control line and LA86DH186 x 5401-9, levels of C14:0 in the
seeds of the F1 progeny decreased to 50% of the levels
obtained in homozygous LA86DH186 lines (Figure 20).
Furthermore, increases in the proportion of C12:0 fatty acid
resulted in a substantial decline in the proportions of all
the long-chain fatty acyl groups (C16:0, C18:0, C18:2, and
C18:3). These results indicate that the ChKAS A-2-7 is an
enzyme with substrate specificity ranging from C6:0 to
C10:0-ACP, and that its over-expression ultimately reduces
the longer chain acyl-ACP pools.

Further evidence is obtained in support of the chain
length specificity of the ChKAS A-2-7 in crosses of the
5401-9 line with a transgenic line (5266) expressing an
18:1/18:0 TE from Garcinia mangostana (GarmFatA1, US patent
application No. 08/440,845). Transgenic Brassica line 5266
has been shown to accumulate up to 24 mol% C18:0 in the seed
oil of homozygous lines (Figure 21). However, in the seed
oil of F1 progeny of crosses between 5266 and 5401-9, levels
of C18:0 were reduced to approximately 12 mol%.
Furthermore, levels of C16:0 generated from these crosses
were similar to the levels obtained from the seed oil of
nontransgenic control plants.

Example 5 In vitro Analysis of Plant KAS Enzymes

Seed extracts were prepared from developing seeds of
nontransgenic controls or transgenic Brassica expressing
chKAS A-2-7 as described in Slabaugh et al. (Plant Journal,
1998, in press) and Leonard et al. (Plant Journal, 1998, in
press). In vitro fatty acid synthesis assays were performed
as described by Post-Beittenmiller (J. Biol. Chem. (1991),
266:1858-1865). Extracts were concentrated by ammonium
sulfate precipitation and desalting using P-6 columns
(Bio-Rad, Hercules, CA). Reactions (65 µl) contained 0.1M
Tris/HCl (pH 8.0), 1 mM dithiothreitol, 25 mM recombinant
spinach ACPI, 1 mM NADH, 2 mM NADPH, 50 µM malonyl-CoA, 10
µM [1-14C]acetyl-CoA (50 mCi/mmol), 1 mg/ml BSA, and 0.25
mg/ml seed protein. Selected seed extracts were
preincubated with cerulenin at 23°C for 10 min. Reaction
products were separated on an 18% acrylamide gel containing
2.25M urea, electroblotted onto nitrocellulose and
quantitated by phosphorimaging using ImageQuaNT software
(Molecular Dynamics, Sunnyvale, CA). Authentic acyl-ACPs
were run in parallel, immunoblotted and finally detected by
anti-ACP serum to confirm fatty acid chain lengths.

The results (Figure 22) indicate that the fatty acid
synthesis capability of transgenic Brassica (5401-9) seed
extracts was greater than that of the nontransgenic
controls, as measured by the relative abundance
of C8:0- and C10:0-ACP at all time points tested. In
addition, pretreatment of the extracts with cerulenin
markedly reduced the synthesis of longer chain fatty acids
in both the transgenic and nontransgenic control seed
extracts. However, the extension of the spinach-ACP was much
less inhibited in the seed extracts from the transgenic
lines than in the seed extracts of nontransgenic control
Brassica.

These data further support that ChKAS A-2-7 is a
condensing enzyme active on medium chain acyl-ACPs, and that
expression of this enzyme in plants results in enlarged
substrate pools to be hydrolyzed by medium-chain specific
thioesterases. Furthermore, these data suggest that
chKAS A-2-7 also is a cerulenin-resistant condensing enzyme.

All publications and patent applications mentioned in
this specification are indicative of the level of skill of
those skilled in the art to which this invention pertains.
All publications and patent applications are herein
incorporated by reference to the same extent as if each
individual publication or patent application was
specifically and individually indicated to be incorporated
by reference.

Although the foregoing invention has been described in
some detail by way of illustration and example for purposes
of clarity of understanding, it will be obvious that certain
changes and modifications may be practiced within the scope
of the appended claim.

13. The construct of Claim 5 wherein said encoding
sequence is cpuKAS A/p8-9A.

14. The construct of Claim 5 wherein said encoding
sequence is chKASIII-27.

15. An improved method for producing medium-chain fatty
acids in transgenic plant seeds by expression of a plant
medium-chain thioesterase protein heterologous to said
transgenic plant,

the improvement comprising expression of a plant synthase
factor protein heterologous to said transgenic plant in
conjunction with expression of said plant medium-chain
thioesterase, whereby the percentage of medium-chain fatty
acids produced in seeds expressing both a plant synthase factor
protein and a plant medium-chain thioesterase protein is
increased as compared to the percentage of medium-chain fatty
acids produced in seeds expressing only said plant medium-chain
thioesterase protein.

16. The method of Claim 15 wherein said medium-chain
thioesterase protein is a ChFatB2 protein.

17. The method of Claim 15 wherein said medium-chain
thioesterase protein is a CpFatB1 protein.

18. The method of Claim 15 wherein said medium-chain
thioesterase protein is a C12 preferring thioesterase from
California bay.

19. The method of Claim 15 wherein said plant synthase
factor protein is expressed from a construct according to Claim
1.

20. The method of Claim 19 wherein said synthase factor A
protein is from a Cuphea species.

21. The method of Claim 20 wherein said Cuphea species is
C. hookeriana or C. pullcherrima.

22. A method of altering the medium-chain fatty acid
composition in plant seeds expressing a heterologous plant
medium-chain preferring thioesterase, wherein said method
comprises

providing for expression of a plant synthase factor
protein heterologous to said transgenic plant in conjunction
with expression of a plant medium-chain thioesterase protein
heterologous to said transgenic plant, whereby the composition
of medium-chain fatty acids produced in said seeds is modified
as compared to the composition of medium-chain fatty acids
produced in seeds expressing said plant medium-chain
thioesterase protein in the absence of expression of said plant
synthase factor protein.

23. The method of Claim 22 wherein said medium-chain
thioesterase protein is a ChFatB2 protein.

24. The method of Claim 22 wherein said medium-chain
thioesterase protein is a CpFatB1 protein.

25. The method of Claim 22 wherein said medium-chain
thioesterase protein is a C12 preferring thioesterase from
California bay.

26. The method of Claim 22 wherein said plant synthase
factor protein is expressed from a construct according to Claim
1.

27. The method of Claim 26 wherein said synthase factor A
protein is from a Cuphea species.

28. The method of Claim 27 wherein said Cuphea species is
C. hookeriana or C. pullcherrima.

29. The method of Claim 22 wherein said fatty acid
composition is enriched for C10 fatty acids.

30. The method of Claim 22 wherein said fatty acid
composition is enriched for C12 fatty acids.

31. The method of Claim 22 wherein said fatty acid
composition is enriched for at least one medium chain fatty
acid and at least one other medium chain fatty acid is
decreased.

32. The method of Claim 31 wherein said enriched fatty
acid is C12 and said decreased fatty acid is C14.

METHODS AND COMPOSITIONS FOR SYNTHESIS OF
LONG CHAIN POLYUNSATURATED FATTY ACIDS

RELATED APPLICATIONS

This application is a continuation-in-part application of United States
Patent Application Serial No. 08/834,655 filed April 11, 1997.

INTRODUCTION 

Field of the Invention

This invention relates to modulating levels of enzymes and/or enzyme
components relating to production of long chain polyunsaturated fatty acids
(PUFAs) in a microorganism or animal.

Background 

Two main families of polyunsaturated fatty acids (PUFAs) are the ω3
fatty acids, exemplified by eicosapentaenoic acid (EPA), and the ω6 fatty acids,
exemplified by arachidonic acid (ARA). PUFAs are important components of
the plasma membrane of the cell, where they may be found in such forms as
phospholipids. PUFAs are necessary for proper development, particularly in the
developing infant brain, and for tissue formation and repair. PUFAs also serve
as precursors to other molecules of importance in human beings and animals,
including the prostacyclins, eicosanoids, leukotrienes and prostaglandins. Four
major long chain PUFAs of importance include docosahexaenoic acid (DHA)
and EPA, which are primarily found in different types of fish oil, γ-linolenic
acid (GLA), which is found in the seeds of a number of plants, including
evening primrose (Oenothera biennis), borage (Borago officinalis) and black
currants (Ribes nigrum), and stearidonic acid (SDA), which is found in marine
oils and plant seeds. Both GLA and another important long chain PUFA,
arachidonic acid (ARA), are found in filamentous fungi. ARA can be purified
from animal tissues including liver and adrenal gland. GLA, ARA, EPA and
SDA are themselves, or are dietary precursors to, important long chain fatty
acids involved in prostaglandin synthesis, in treatment of heart disease, and in
development of brain tissue.

For DHA, a number of sources exist for commercial production,
including a variety of marine organisms, oils obtained from cold water marine
fish, and egg yolk fractions. For ARA, microorganisms including the genera
Mortierella, Entomophthora, Phytium and Porphyridium can be used for
commercial production. Commercial sources of SDA include the genera
Trichodesma and Echium. Commercial sources of GLA include evening
primrose, black currants and borage. However, there are several disadvantages
associated with commercial production of PUFAs from natural sources. Natural
sources of PUFAs, such as animals and plants, tend to have highly
heterogeneous oil compositions. The oils obtained from these sources therefore
can require extensive purification to separate out one or more desired PUFAs or
to produce an oil which is enriched in one or more PUFAs. Natural sources also
are subject to uncontrollable fluctuations in availability. Fish stocks may
undergo natural variation or may be depleted by overfishing. Fish oils have
unpleasant tastes and odors, which may be impossible to economically separate
from the desired product, and can render such products unacceptable as food
supplements. Animal oils, and particularly fish oils, can accumulate
environmental pollutants. Weather and disease can cause fluctuation in yields
from both fish and plant sources. Cropland available for production of alternate
oil-producing crops is subject to competition from the steady expansion of
human populations and the associated increased need for food production on the
remaining arable land. Crops which do produce PUFAs, such as borage, have
not been adapted to commercial growth and may not perform well in
monoculture. Growth of such crops is thus not economically competitive where
more profitable and better established crops can be grown. Large scale
fermentation of organisms such as Mortierella is also expensive. Natural
animal tissues contain low amounts of ARA and are difficult to process.
Microorganisms such as Porphyridium and Mortierella are difficult to cultivate
on a commercial scale.

Dietary supplements and pharmaceutical formulations containing
PUFAs can retain the disadvantages of the PUFA source. Supplements such as
fish oil capsules can contain low levels of the particular desired component and
thus require large dosages. High dosages result in ingestion of high levels of
undesired components, including contaminants. Unpleasant tastes and odors of
the supplements can make such regimens undesirable, and may inhibit
compliance by the patient. Care must be taken in providing fatty acid
supplements, as overaddition may result in suppression of endogenous
biosynthetic pathways and lead to competition with other necessary fatty acids
in various lipid fractions in vivo, leading to undesirable results. For example,
Eskimos having a diet high in ω3 fatty acids have an increased tendency to
bleed (U.S. Pat. No. 4,874,603).

A number of enzymes are involved in PUFA biosynthesis. Linoleic acid
(LA, 18:2 Δ9, 12) is produced from oleic acid (18:1 Δ9) by a Δ12-desaturase.
GLA (18:3 Δ6, 9, 12) is produced from linoleic acid (LA, 18:2 Δ9, 12) by a
Δ6-desaturase. ARA (20:4 Δ5, 8, 11, 14) production from dihomo-γ-linolenic acid
(DGLA, 20:3 Δ8, 11, 14) is catalyzed by a Δ5-desaturase. However, animals
cannot desaturate beyond the Δ9 position and therefore cannot convert oleic
acid (18:1 Δ9) into linoleic acid (18:2 Δ9, 12). Likewise, α-linolenic acid
(ALA, 18:3 Δ9, 12, 15) cannot be synthesized by mammals. Other eukaryotes,
including fungi and plants, have enzymes which desaturate at positions Δ12 and
Δ15. The major polyunsaturated fatty acids of animals therefore are either
derived from diet and/or from desaturation and elongation of linoleic acid (18:2
Δ9, 12) or α-linolenic acid (18:3 Δ9, 12, 15). Therefore it is of interest to obtain
genetic material involved in PUFA biosynthesis from species that naturally
produce these fatty acids and to express the isolated material in a microbial or
animal system which can be manipulated to provide production of commercial
quantities of one or more PUFAs. Thus there is a need for fatty acid
desaturases, genes encoding them, and recombinant methods of producing them.
A need further exists for oils containing higher relative proportions of and/or
enriched in specific PUFAs. A need also exists for reliable economical methods
of producing specific PUFAs.
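
The desaturation steps described above can be followed with a small
model of the fatty acid shorthand (carbon count, number of double
bonds, and double bond positions counted from the carboxyl end). The
Python sketch below is only an illustration; the class and function
names are hypothetical, and elongation steps are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FattyAcid:
    carbons: int
    double_bonds: tuple  # delta positions, counted from the carboxyl end

    def shorthand(self) -> str:
        """Render e.g. '18:2 Δ9,12'."""
        text = f"{self.carbons}:{len(self.double_bonds)}"
        if self.double_bonds:
            text += " Δ" + ",".join(str(p) for p in self.double_bonds)
        return text


def desaturate(acid: FattyAcid, position: int) -> FattyAcid:
    """Model a desaturase introducing a double bond at the given position."""
    if position in acid.double_bonds:
        raise ValueError(f"already desaturated at position {position}")
    positions = tuple(sorted(acid.double_bonds + (position,)))
    return FattyAcid(acid.carbons, positions)


oleic = FattyAcid(18, (9,))
linoleic = desaturate(oleic, 12)   # Δ12-desaturase: oleic acid -> LA
gla = desaturate(linoleic, 6)      # Δ6-desaturase: LA -> GLA
dgla = FattyAcid(20, (8, 11, 14))
ara = desaturate(dgla, 5)          # Δ5-desaturase: DGLA -> ARA

for acid in (oleic, linoleic, gla, dgla, ara):
    print(acid.shorthand())
```

In this model, an organism lacking Δ12- and Δ15-desaturases cannot
get from oleic acid to LA or ALA with the steps available to it,
which is the limitation the text describes for animals.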

Relevant Literature 

Production of γ-linolenic acid by a Δ6-desaturase is described in USPN
5,552,306. Production of 8,11-eicosadienoic acid using Mortierella alpina is
disclosed in USPN 5,376,541. Production of docosahexaenoic acid by
dinoflagellates is described in USPN 5,407,957. Cloning of a
Δ6-palmitoyl-acyl carrier protein desaturase is described in PCT publication
WO 96/13591 and USPN 5,614,400. Cloning of a Δ6-desaturase from borage is
described in PCT publication WO 96/21022. Cloning of Δ9-desaturases is
described in the published patent applications PCT WO 91/13972, EP 0 550 162
A1, EP 0 561 569 A2, EP 0 644 263 A2, and EP 0 736 598 A1, and in USPN
5,057,419. Cloning of Δ12-desaturases from various organisms is described in
PCT publication WO 94/11516 and USPN 5,443,974. Cloning of Δ15-desaturases
from various organisms is described in PCT publication WO 93/11245. All
publications and U.S. patents or applications referred to herein are hereby
incorporated in their entirety by reference.

SUMMARY OF THE INVENTION 

Novel compositions and methods are provided for preparation of
polyunsaturated long chain fatty acids. The compositions include nucleic acid
encoding a Δ6- and Δ12-desaturase and/or polypeptides having Δ6- and/or
Δ12-desaturase activity, the polypeptides, and probes for isolating and
detecting the same. The methods involve growing a host microorganism or animal
expressing an introduced gene or genes encoding at least one desaturase,
particularly a Δ6-, Δ9-, Δ12- or Δ15-desaturase. The methods also involve the
use of antisense constructs or gene disruptions to decrease or eliminate the
expression level of undesired desaturases. Regulation of expression of the
desaturase polypeptide(s) provides for a relative increase in desired desaturated
PUFAs as a result of altered concentrations of enzymes and substrates involved
in PUFA biosynthesis. The invention finds use, for example, in the large scale
production of GLA, DGLA, ARA, EPA, DHA and SDA.

In a preferred embodiment of the invention, provided are an isolated
nucleic acid comprising a nucleotide sequence depicted in Figure 3A-E (SEQ ID
NO: 1) or Figure 5A-D (SEQ ID NO: 3), a polypeptide encoded by a nucleotide
sequence according to Figure 3A-E (SEQ ID NO: 1) or Figure 5A-D (SEQ ID NO:
3), and a purified or isolated polypeptide comprising an amino acid sequence
depicted in Figure 3A-E (SEQ ID NO: 2) or Figure 5A-D (SEQ ID NO: 4). In
another embodiment of the invention, provided is an isolated nucleic acid
encoding a polypeptide having an amino acid sequence depicted in Figure 3A-E
(SEQ ID NO: 2) or Figure 5A-D (SEQ ID NO: 4).

Also provided is an isolated nucleic acid comprising a nucleotide
sequence which encodes a polypeptide which desaturates a fatty acid molecule
at carbon 6 or 12 from the carboxyl end, wherein said nucleotide sequence has
an average A/T content of less than about 60%. In a preferred embodiment, the
isolated nucleic acid is derived from a fungus, such as a fungus of the genus
Mortierella. More preferred is a fungus of the species Mortierella alpina.
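
The A/T content criterion above can be checked for any candidate
coding sequence by counting bases. The following Python sketch is a
minimal illustration; the function name and the toy sequence are
hypothetical (the sequence is not taken from SEQ ID NO: 1 or 3), and
ambiguous base codes are simply ignored, which is an assumption
rather than anything stated in the specification.

```python
def at_content(sequence: str) -> float:
    """Percent of A and T (or U) among the unambiguous bases of a sequence."""
    bases = [b for b in sequence.upper() if b in "ACGTU"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    at = sum(1 for b in bases if b in "ATU")
    return 100.0 * at / len(bases)


# Hypothetical toy sequence for illustration only.
toy = "ATGGCCGCTAAGCTTGGC"
print(f"A/T content: {at_content(toy):.1f}%")
print("below 60%:", at_content(toy) < 60.0)
```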

In another preferred embodiment of the invention, an isolated nucleic
acid is provided wherein the nucleotide sequence of the nucleic acid is depicted
in Figure 3A-E (SEQ ID NO: 1) or Figure 5A-D (SEQ ID NO: 3). The
invention also provides an isolated or purified polypeptide which desaturates a
fatty acid molecule at carbon 6 or 12 from the carboxyl end, wherein the
polypeptide is a eukaryotic polypeptide or is derived from a eukaryotic
polypeptide, where a preferred eukaryotic polypeptide is derived from a fungus.

The present invention further includes a nucleic acid sequence which
hybridizes to Figure 3A-E (SEQ ID NO: 1) or Figure 5A-D (SEQ ID NO: 3).
Preferred is an isolated nucleic acid having a nucleotide sequence with at least
about 50% homology to Figure 3A-E (SEQ ID NO: 1) or Figure 5A-D (SEQ
ID NO: 3). The invention also includes an isolated nucleic acid having a
nucleotide sequence with at least about 50% homology to Figure 3A-E (SEQ
ID NO: 1) or Figure 5A-D (SEQ ID NO: 3). In a preferred embodiment, the
nucleic acid of the invention includes a nucleotide sequence which encodes an
amino acid sequence depicted in Figure 3A-D (SEQ ID NO: 2) which is
selected from the group consisting of amino acid residues 50-53, 39-43,
172-176, 204-213, and 390-402.

Also provided by the present invention is a nucleic acid construct
comprising a nucleotide sequence depicted in Figure 3A-E (SEQ ID NO: 1)
or Figure 5A-D (SEQ ID NO: 3) linked to a heterologous nucleic acid. In
another embodiment, a nucleic acid construct is provided which comprises a
nucleotide sequence depicted in Figure 3A-E (SEQ ID NO: 1) or Figure 5A-D
(SEQ ID NO: 3) operably associated with an expression control sequence
functional in a host cell. The host cell is either eukaryotic or prokaryotic.
Preferred eukaryotic host cells are those selected from the group consisting of a
mammalian cell, an insect cell, a fungal cell, and an algae cell. Preferred
mammalian cells include an avian cell, a preferred fungal cell includes a yeast
cell, and a preferred algae cell is a marine algae cell. Preferred prokaryotic cells
include those selected from the group consisting of a bacteria, a cyanobacteria,
cells which contain a bacteriophage, and/or a virus. The DNA sequence of the
recombinant host cell preferably contains a promoter which is functional in the
host cell, which promoter is preferably inducible. In a more preferred
embodiment, the microbial cell is a fungal cell of the genus Mortierella, with a
more preferred fungus being of the species Mortierella alpina.

In addition, the present invention provides a nucleic acid construct
comprising a nucleotide sequence which encodes a polypeptide comprising an
amino acid sequence which corresponds to or is complementary to an amino
acid sequence depicted in Figure 3A-E (SEQ ID NO: 2) or Figure 5A-D (SEQ
ID NO: 4), wherein the nucleic acid is operably associated with an expression
control sequence functional in a microbial cell, wherein the nucleotide sequence
encodes a functionally active polypeptide which desaturates a fatty acid
molecule at carbon 6 or carbon 12 from the carboxyl end of a fatty acid
molecule. Another embodiment of the present invention is a nucleic acid
construct comprising a nucleotide sequence which encodes a functionally
active Δ6-desaturase having an amino acid sequence which corresponds to or is
complementary to all of or a portion of an amino acid sequence depicted in
Figure 3A-E (SEQ ID NO: 2), wherein the nucleotide sequence is operably
associated with a transcription control sequence functional in a host cell.

Yet another embodiment of the present invention is a nucleic acid
construct comprising a nucleotide sequence which encodes a functionally
active Δ12-desaturase having an amino acid sequence which corresponds to or
is complementary to all of or a portion of an amino acid sequence depicted in
Figure 5A-D (SEQ ID NO: 4), wherein the nucleotide sequence is operably
associated with a transcription control sequence functional in a host cell. The
host cell is either a eukaryotic or prokaryotic host cell. Preferred eukaryotic
host cells are those selected from the group consisting of a mammalian cell, an
insect cell, a fungal cell, and an algae cell. Preferred mammalian cells include
an avian cell, a preferred fungal cell includes a yeast cell, and a preferred algae
cell is a marine algae cell. Preferred prokaryotic cells include those selected
from the group consisting of a bacteria, a cyanobacteria, cells which contain a
bacteriophage, and/or a virus. The DNA sequence of the recombinant host cell
preferably contains a promoter which is functional in the host cell and which
preferably is inducible. A preferred recombinant host cell is a microbial cell
such as a yeast cell, such as a Saccharomyces cell.

The present invention also provides a recombinant microbial cell
comprising at least one copy of a nucleic acid which encodes a functionally
active Mortierella alpina fatty acid desaturase having an amino acid sequence
as depicted in Figure 3A-E (SEQ ID NO: 2), wherein the cell or a parent of the
cell was transformed with a vector comprising said DNA sequence, and wherein
the DNA sequence is operably associated with an expression control sequence.
In a preferred embodiment, the cell is a microbial cell which is enriched in 18:2
fatty acids, particularly where the microbial cell is from a genus selected from
the group consisting of a prokaryotic cell and eukaryotic cell. In another
preferred embodiment, the microbial cell according to the invention includes an
expression control sequence which is endogenous to the microbial cell.

Also provided by the present invention is a method for production of
GLA in a host cell, where the method comprises growing a host culture having
a plurality of host cells which contain one or more nucleic acids encoding a
polypeptide which converts LA to GLA, wherein said one or more nucleic acids
is operably associated with an expression control sequence, under conditions
whereby said one or more nucleic acids are expressed, whereby GLA is
produced in the host cell. In several preferred embodiments of the methods, the
polypeptide employed in the method is a functionally active enzyme which
desaturates a fatty acid molecule at carbon 6 from the carboxyl end of a fatty
acid molecule; the said one or more nucleic acids is derived from a Mortierella
alpina; the substrate for the polypeptide is exogenously supplied; the host cells
are microbial cells; the microbial cells are yeast cells, such as Saccharomyces
cells; and the growing conditions are inducible.

Also provided is an oil comprising one or more PUFAs, wherein the
amount of said one or more PUFAs is approximately 0.3-30% arachidonic acid
(ARA), approximately 0.2-30% dihomo-γ-linolenic acid (DGLA), and
approximately 0.2-30% γ-linolenic acid (GLA). A preferred oil of the invention
is one in which the ratio of ARA:DGLA:GLA is approximately 1.0:19.0:30 to
6.0:1.0:0.2. Another preferred embodiment of the invention is a pharmaceutical
composition comprising the oils in a pharmaceutically acceptable carrier.

Further provided is a nutritional composition comprising the oils of the
invention. The nutritional compositions of the invention preferably are
administered to a mammalian host parenterally or internally. A preferred
composition of the invention for internal consumption is an infant formula. In a
preferred embodiment, the nutritional compositions of the invention are in a
liquid form or a solid form, and can be formulated in or as a dietary supplement,
and the oils provided in encapsulated form. The oils of the invention can be
free of particular components of other oils and can be derived from a microbial
cell, such as a yeast cell.

The present invention further provides a method for desaturating a fatty
acid. In a preferred embodiment the method comprises culturing a recombinant
microbial cell according to the invention under conditions suitable for
expression of a polypeptide encoded by said nucleic acid, wherein the host cell
further comprises a fatty acid substrate of said polypeptide. Also provided is a
fatty acid desaturated by such a method, and an oil composition comprising a
fatty acid produced according to the methods of the invention.

The present invention further includes a purified nucleotide sequence or
polypeptide sequence that is substantially related or homologous to the
nucleotide and peptide sequences presented in SEQ ID NO:1 - SEQ ID NO:40.
The present invention is further directed to methods of using the sequences
presented in SEQ ID NO:1 to SEQ ID NO:40 as probes to identify related
sequences, as components of expression systems and as components of systems
useful for producing transgenic oil.

The present invention is further directed to formulas, dietary
supplements or dietary supplements in the form of a liquid or a solid containing
the long chain fatty acids of the invention. These formulas and supplements
may be administered to a human or an animal.

The formulas and supplements of the invention may further comprise at
least one macronutrient selected from the group consisting of coconut oil, soy
oil, canola oil, mono- and diglycerides, glucose, edible lactose, electrodialysed
whey, electrodialysed skim milk, milk whey, soy protein, and other protein
hydrolysates.

The formulas of the present invention may further include at least one
vitamin selected from the group consisting of Vitamins A, C, D, E, and B
complex; and at least one mineral selected from the group consisting of
calcium, magnesium, zinc, manganese, sodium, potassium, phosphorus, copper,
chloride, iodine, selenium, and iron.

The present invention is further directed to a method of treating a patient
having a condition caused by insufficient intake or production of polyunsaturated
fatty acids comprising administering to the patient a dietary substitute of the
invention in an amount sufficient to effect treatment of the patient.

The present invention is further directed to cosmetic and pharmaceutical
compositions of the material of the invention.

The present invention is further directed to transgenic oils in
pharmaceutically acceptable carriers. The present invention is further directed
to nutritional supplements, cosmetic agents and infant formulae containing
transgenic oils.

The present invention is further directed to a method for obtaining
altered long chain polyunsaturated fatty acid biosynthesis comprising the steps
of: growing a microbe having cells which contain a transgene which encodes a
transgene expression product which desaturates a fatty acid molecule at carbon
6 or 12 from the carboxyl end of said fatty acid molecule, wherein the transgene
is operably associated with an expression control sequence, under conditions
whereby the transgene is expressed, whereby long chain polyunsaturated fatty
acid biosynthesis in the cells is altered.

The present invention is further directed toward pharmaceutical
compositions comprising at least one nutrient selected from the group consisting
of a vitamin, a mineral, a carbohydrate, a sugar, an amino acid, a free fatty acid,
a phospholipid, an antioxidant, and a phenolic compound.



BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 shows possible pathways for the synthesis of arachidonic acid
(20:4 Δ5,8,11,14) and stearidonic acid (18:4 Δ6,9,12,15) from palmitic acid
(C16) from a variety of organisms, including algae, Mortierella and humans.
These PUFAs can serve as precursors to other molecules important for humans
and other animals, including prostacyclins, leukotrienes, and prostaglandins,
some of which are shown.

Figure 2 shows possible pathways for production of PUFAs in addition
to ARA, including EPA and DHA, again compiled from a variety of organisms.

Figure 3A-E shows the DNA sequence of the Mortierella alpina
Δ6-desaturase and the deduced amino acid sequence:

Figure 3A-E (SEQ ID NO 1 Δ6 DESATURASE cDNA)

Figure 3A-E (SEQ ID NO 2 Δ6 DESATURASE AMINO ACID)

Figure 4 shows an alignment of a portion of the Mortierella alpina
Δ6-desaturase amino acid sequence with other related sequences.

Figure 5A-D shows the DNA sequence of the Mortierella alpina
Δ12-desaturase and the deduced amino acid sequence:

Figure 5A-D (SEQ ID NO 3 Δ12 DESATURASE cDNA)

Figure 5A-D (SEQ ID NO 4 Δ12 DESATURASE AMINO ACID).

Figures 6A and 6B show the effect of different expression constructs on
expression of GLA in yeast.

Figures 7A and 7B show the effect of host strain on GLA production.

Figures 8A and 8B show the effect of temperature on GLA production in
S. cerevisiae strain SC334.

Figure 9 shows alignments of the protein sequence of the Ma 29 and
contig 253538a.

Figure 10 shows alignments of the protein sequence of Ma 524 and
contig 253538a.

BRIEF DESCRIPTION OF THE SEQUENCE LISTINGS

SEQ ID NO:1 shows the DNA sequence of the Mortierella alpina
Δ6-desaturase.

SEQ ID NO:2 shows the protein sequence of the Mortierella alpina
Δ6-desaturase.

SEQ ID NO:3 shows the DNA sequence of the Mortierella alpina
Δ12-desaturase.

SEQ ID NO:4 shows the protein sequence of the Mortierella alpina
Δ12-desaturase.

SEQ ID NO:5-11 show various desaturase sequences.

SEQ ID NO:13-18 show various PCR primer sequences.

SEQ ID NO:19 and SEQ ID NO:20 show the nucleotide and amino acid
sequence of a Dictyostelium discoideum desaturase.

SEQ ID NO:21 and SEQ ID NO:22 show the nucleotide and amino acid
sequence of a Phaeodactylum tricornutum desaturase.

SEQ ID NO:23-26 show the nucleotide and deduced amino acid
sequence of a Schizochytrium cDNA clone.

SEQ ID NO:27-33 show nucleotide sequences for human desaturases.

SEQ ID NO:34 - SEQ ID NO:40 show peptide sequences for human
desaturases.

DESCRIPTION OF THE PREFERRED EMBODIMENTS 

In order to ensure a complete understanding of the invention, the 
following definitions are provided: 

Δ5-Desaturase: Δ5-desaturase is an enzyme which introduces a double
bond between carbons 5 and 6 from the carboxyl end of a fatty acid molecule.

Δ6-Desaturase: Δ6-desaturase is an enzyme which introduces a double
bond between carbons 6 and 7 from the carboxyl end of a fatty acid molecule.

Δ9-Desaturase: Δ9-desaturase is an enzyme which introduces a double
bond between carbons 9 and 10 from the carboxyl end of a fatty acid molecule.

Δ12-Desaturase: Δ12-desaturase is an enzyme which introduces a
double bond between carbons 12 and 13 from the carboxyl end of a fatty acid
molecule.
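The numbering convention in these definitions can be illustrated with a short Python sketch. It is not part of the disclosure: the `FattyAcid` class and the `desaturate` helper are hypothetical names introduced only for illustration, and the model tracks nothing but chain length and double-bond positions counted from the carboxyl end.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FattyAcid:
    """Hypothetical minimal model: chain length plus Δ positions of double bonds."""
    carbons: int
    double_bonds: tuple = ()  # positions counted from the carboxyl end

    def shorthand(self) -> str:
        # e.g. "18:0" for a saturated chain, "Δ9,12-18:2" for linoleic acid
        base = f"{self.carbons}:{len(self.double_bonds)}"
        if not self.double_bonds:
            return base
        return "Δ" + ",".join(str(p) for p in self.double_bonds) + "-" + base


def desaturate(fa: FattyAcid, position: int) -> FattyAcid:
    """Model a Δ<position>-desaturase: add a double bond between carbons
    `position` and `position + 1`, counted from the carboxyl end."""
    if not 2 <= position < fa.carbons:
        raise ValueError("position lies outside the carbon chain")
    if position in fa.double_bonds:
        raise ValueError("a double bond is already present at this position")
    return FattyAcid(fa.carbons, tuple(sorted(fa.double_bonds + (position,))))


stearic = FattyAcid(18)
oleic = desaturate(stearic, 9)      # Δ9-desaturase
linoleic = desaturate(oleic, 12)    # Δ12-desaturase
gla = desaturate(linoleic, 6)       # Δ6-desaturase
print(oleic.shorthand(), linoleic.shorthand(), gla.shorthand())
```

The three calls follow the stearic acid to oleic acid to LA to GLA route discussed elsewhere in this description.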

Fatty Acids: Fatty acids are a class of compounds containing a long
hydrocarbon chain and a terminal carboxylate group. Fatty acids include the
following:



Fatty Acid

12:0           lauric acid
16:0           palmitic acid
16:1           palmitoleic acid
18:0           stearic acid
18:1           oleic acid                     Δ9-18:1
18:2 Δ5,9      taxoleic acid                  Δ5,9-18:2
18:2 Δ6,9      6,9-octadecadienoic acid       Δ6,9-18:2
18:2           linoleic acid                  Δ9,12-18:2 (LA)
18:3 Δ6,9,12   gamma-linolenic acid           Δ6,9,12-18:3 (GLA)
18:3 Δ5,9,12   pinolenic acid                 Δ5,9,12-18:3
18:3           alpha-linolenic acid           Δ9,12,15-18:3 (ALA)
18:4           stearidonic acid               Δ6,9,12,15-18:4 (SDA)
20:0           arachidic acid
20:1           eicosenoic acid
22:0           behenic acid
22:2           docosadienoic acid
20:4 ω6        arachidonic acid               Δ5,8,11,14-20:4 (ARA)
20:3 ω6        ω6-eicosatrienoic              Δ8,11,14-20:3 (DGLA)
               (dihomo-gamma-linolenic)
20:5 ω3        eicosapentaenoic               Δ5,8,11,14,17-20:5 (EPA)
               (timnodonic acid)
20:3 ω3        ω3-eicosatrienoic              Δ11,14,17-20:3
20:4 ω3        ω3-eicosatetraenoic            Δ8,11,14,17-20:4
22:5 ω3        docosapentaenoic               Δ7,10,13,16,19-22:5 (ω3 DPA)
22:6 ω3        docosahexaenoic                Δ4,7,10,13,16,19-22:6 (DHA)
               (cervonic acid)
24:0           lignoceric acid


Taking into account these definitions, the present invention is directed to
novel DNA sequences, DNA constructs, methods and compositions which
permit modification of the polyunsaturated long chain fatty
acid content of, for example, microbial cells or animals. Host cells are
manipulated to express a sense or antisense transcript of a DNA encoding a
polypeptide(s) which catalyzes the desaturation of a fatty acid. The substrate(s)
for the expressed enzyme may be produced by the host cell or may be
exogenously supplied. To achieve expression, the transformed DNA is
operably associated with transcriptional and translational initiation and
termination regulatory regions that are functional in the host cell. Constructs
comprising the gene to be expressed can provide for integration into the genome
of the host cell or can autonomously replicate in the host cell. For production of
linoleic acid (LA), the expression cassettes generally used include a cassette
which provides for Δ12-desaturase activity, particularly in a host cell which
produces or can take up oleic acid (U.S. Patent No. 5,443,974). Production of
LA also can be increased by providing an expression cassette for a
Δ9-desaturase where that enzymatic activity is limiting. For production of ALA,
the expression cassettes generally used include a cassette which provides for
Δ15- or ω3-desaturase activity, particularly in a host cell which produces or can
take up LA. For production of GLA or SDA, the expression cassettes generally
used include a cassette which provides for Δ6-desaturase activity, particularly in
a host cell which produces or can take up LA or ALA, respectively. Production
of ω6-type unsaturated fatty acids, such as LA or GLA, is favored in a host
microorganism or animal which is incapable of producing ALA. The host ALA
production can be removed, reduced and/or inhibited by inhibiting the activity
of a Δ15- or ω3-type desaturase (see Figure 2). This can be accomplished by
standard selection, providing an expression cassette for an antisense Δ15 or ω3
transcript, by disrupting a target Δ15- or ω3-desaturase gene through insertion,
deletion, substitution of part or all of the target gene, or by adding an inhibitor
of Δ15- or ω3-desaturase. Similarly, production of LA or ALA is favored in a
microorganism or animal having Δ6-desaturase activity by providing an
expression cassette for an antisense Δ6 transcript, by disrupting a Δ6-desaturase
gene, or by use of a Δ6-desaturase inhibitor.

MICROBIAL PRODUCTION OF FATTY ACIDS 

Microbial production of fatty acids has several advantages over
purification from natural sources such as fish or plants. Many microbes are
known with greatly simplified oil compositions compared with those of higher
organisms, making purification of desired components easier. Microbial
production is not subject to fluctuations caused by external variables such as
weather and food supply. Microbially produced oil is substantially free of
contamination by environmental pollutants. Additionally, microbes can provide
PUFAs in particular forms which may have specific uses. For example,
Spirulina can provide PUFAs predominantly at the first and third positions of
triglycerides; digestion by pancreatic lipases preferentially releases fatty acids
from these positions. Following human or animal ingestion of triglycerides
derived from Spirulina, these PUFAs are released by pancreatic lipases as free
fatty acids and thus are directly available, for example, for infant brain
development. Additionally, microbial oil production can be manipulated by
controlling culture conditions, notably by providing particular substrates for
microbially expressed enzymes, or by addition of compounds which suppress
undesired biochemical pathways. In addition to these advantages, production of
fatty acids from recombinant microbes provides the ability to alter the naturally
occurring microbial fatty acid profile by providing new synthetic pathways in
the host or by suppressing undesired pathways, thereby increasing levels of
desired PUFAs, or conjugated forms thereof, and decreasing levels of undesired
PUFAs.

PRODUCTION OF FATTY ACIDS IN ANIMALS

Production of fatty acids in animals also presents several advantages.
Expression of desaturase genes in animals can produce greatly increased levels
of desired PUFAs in animal tissues, making recovery from those tissues more
economical. For example, where the desired PUFAs are expressed in the breast
milk of animals, methods of isolating PUFAs from animal milk are well
established. In addition to providing a source for purification of desired
PUFAs, animal breast milk can be manipulated through expression of
desaturase genes, either alone or in combination with other human genes, to
provide animal milks substantially similar to human breast milk during the
different stages of infant development. Humanized animal milks could serve as
infant formulas where human nursing is impossible or undesired, or in cases of
malnourishment or disease.

Depending upon the host cell, the availability of substrate, and the
desired end product(s), several polypeptides, particularly desaturases, are of
interest. By "desaturase" is intended a polypeptide which can desaturate one or
more fatty acids to produce a mono- or poly-unsaturated fatty acid or precursor
thereof of interest. Of particular interest are polypeptides which can catalyze
the conversion of stearic acid to oleic acid, of oleic acid to LA, of LA to ALA,
of LA to GLA, or of ALA to SDA, which includes enzymes which desaturate at
the Δ9, Δ12 (ω6), Δ15 (ω3) or Δ6 positions. By "polypeptide" is meant any
chain of amino acids, regardless of length or post-translational modification, for
example, glycosylation or phosphorylation. Considerations for choosing a
specific polypeptide having desaturase activity include the pH optimum of the
polypeptide, whether the polypeptide is a rate limiting enzyme or a component
thereof, whether the desaturase used is essential for synthesis of a desired
poly-unsaturated fatty acid, and/or co-factors required by the polypeptide. The
expressed polypeptide preferably has parameters compatible with the
biochemical environment of its location in the host cell. For example, the
polypeptide may have to compete for substrate with other enzymes in the host
cell. Analyses of the Km and specific activity of the polypeptide in question
therefore are considered in determining the suitability of a given polypeptide for
modifying PUFA production in a given host cell. The polypeptide used in a
particular situation is one which can function under the conditions present in the
intended host cell but otherwise can be any polypeptide having desaturase
activity which has the desired characteristic of being capable of modifying the
relative production of a desired PUFA.

For production of linoleic acid from oleic acid, the DNA sequence used
encodes a polypeptide having Δ12-desaturase activity. For production of GLA
from linoleic acid, the DNA sequence used encodes a polypeptide having
Δ6-desaturase activity. In particular instances, expression of Δ6-desaturase activity
can be coupled with expression of Δ12-desaturase activity and the host cell can
optionally be depleted of any Δ15-desaturase activity present, for example by
providing a transcription cassette for production of antisense sequences to the
Δ15-desaturase transcription product, by disrupting the Δ15-desaturase gene, or
by using a host cell which naturally has, or has been mutated to have, low
Δ15-desaturase activity. Inhibition of undesired desaturase pathways also can be
accomplished through the use of specific desaturase inhibitors such as those
described in U.S. Patent No. 4,778,630. Also, a host cell for Δ6-desaturase
expression may have, or have been mutated to have, high Δ12-desaturase
activity. The choice of combination of cassettes used depends in part on the
PUFA profile and/or desaturase profile of the host cell. Where the host cell
expresses Δ12-desaturase activity and lacks or is depleted in Δ15-desaturase
activity, overexpression of Δ6-desaturase alone generally is sufficient to provide
for enhanced GLA production. Where the host cell expresses Δ9-desaturase
activity, expression of a Δ12- and a Δ6-desaturase can provide for enhanced
GLA production. When Δ9-desaturase activity is absent or limiting, an
expression cassette for Δ9-desaturase can be used. A scheme for the synthesis
of arachidonic acid (20:4 Δ5,8,11,14) from stearic acid (18:0) is shown in Figure
2. A key enzyme in this pathway is a Δ6-desaturase which converts the linoleic
acid into γ-linolenic acid. Conversion of α-linolenic acid (ALA) to stearidonic
acid by a Δ6-desaturase also is shown.
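The cassette-selection reasoning in the preceding paragraph can be summarized as a small decision sketch. This is a deliberate simplification and not a procedure taken from the disclosure: the function name `plan_gla_cassettes` and the representation of a host as a set of activity labels are assumptions made for illustration, and "limiting" activity is treated the same as absent activity only if the caller leaves it out of the set.

```python
def plan_gla_cassettes(host_activities):
    """Hypothetical helper: suggest desaturase cassettes for enhanced GLA
    production, given the desaturase activities a host already expresses
    (labels such as "Δ9", "Δ12", "Δ15")."""
    # Supply any upstream activity the host lacks (stearic -> oleic -> LA).
    cassettes = [d for d in ("Δ9", "Δ12") if d not in host_activities]
    # A Δ6-desaturase cassette is always included (LA -> GLA).
    cassettes.append("Δ6")
    # Δ15 activity diverts LA toward ALA, so flag it for optional depletion
    # (antisense, gene disruption, or an inhibitor, as described above).
    return {"express": cassettes, "deplete_delta15": "Δ15" in host_activities}


print(plan_gla_cassettes({"Δ9", "Δ12"}))
print(plan_gla_cassettes({"Δ9", "Δ15"}))
```

In practice the choice also depends on substrate availability and on the PUFA profile of the host, which this sketch does not model.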

SOURCES OF POLYPEPTIDES 
HAVING DESATURASE ACTIVITY 

A source of polypeptides having desaturase activity and oligonucleotides
encoding such polypeptides are organisms which produce a desired
poly-unsaturated fatty acid. As an example, microorganisms having an ability to
produce GLA or ARA can be used as a source of Δ6- or Δ12-desaturase
activity. Such microorganisms include, for example, those belonging to the
genera Mortierella, Conidiobolus, Pythium, Phytophthora, Penicillium,
Porphyridium, Coidosporium, Mucor, Fusarium, Aspergillus, Rhodotorula, and
Entomophthora. Within the genus Porphyridium, of particular interest is
Porphyridium cruentum. Within the genus Mortierella, of particular interest are
Mortierella elongata, Mortierella exigua, Mortierella hygrophila, Mortierella
ramanniana, var. angulispora, and Mortierella alpina. Within the genus Mucor,
of particular interest are Mucor circinelloides and Mucor javanicus.

DNAs encoding desired desaturases can be identified in a variety of
ways. As an example, a source of the desired desaturase, for example genomic
or cDNA libraries from Mortierella, is screened with detectable enzymatically-
or chemically-synthesized probes, which can be made from DNA, RNA, or
non-naturally occurring nucleotides, or mixtures thereof. Probes may be
enzymatically synthesized from DNAs of known desaturases for normal or
reduced-stringency hybridization methods. Oligonucleotide probes also can be
used to screen sources and can be based on sequences of known desaturases,
including sequences conserved among known desaturases, or on peptide
sequences obtained from the desired purified protein. Oligonucleotide probes
based on amino acid sequences can be degenerate to encompass the degeneracy
of the genetic code, or can be biased in favor of the preferred codons of the
source organism. Oligonucleotides also can be used as primers for PCR from
reverse transcribed mRNA from a known or suspected source; the PCR product
can be the full length cDNA or can be used to generate a probe to obtain the
desired full length cDNA. Alternatively, a desired protein can be entirely
sequenced and total synthesis of a DNA encoding that polypeptide performed.
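To make the notion of a degenerate probe concrete, the following Python sketch counts how many distinct oligonucleotides are needed to cover every possible coding sequence of a peptide under the standard genetic code, and picks the least degenerate window of a peptide. The helper names `degeneracy` and `best_probe_window` are illustrative assumptions and do not come from the disclosure.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODONS_PER_AA = {
    "M": 1, "W": 1,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2, "K": 2, "N": 2, "Q": 2, "Y": 2,
    "I": 3,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "L": 6, "R": 6, "S": 6,
}


def degeneracy(peptide):
    """Number of distinct DNA sequences that can encode `peptide`."""
    return prod(CODONS_PER_AA[aa] for aa in peptide)


def best_probe_window(peptide, length):
    """Return (start_index, window, degeneracy) for the window of `length`
    residues with the lowest degeneracy; the first such window wins ties."""
    windows = [(i, peptide[i:i + length]) for i in range(len(peptide) - length + 1)]
    start, window = min(windows, key=lambda w: degeneracy(w[1]))
    return start, window, degeneracy(window)


print(degeneracy("LRS"))                  # three six-codon residues
print(best_probe_window("LSRMWKDL", 4))   # prefers Met/Trp-rich stretches
```

Windows rich in methionine and tryptophan give the least degenerate probes, which is why such stretches are commonly favored when designing probes from peptide sequence.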

Once the desired genomic or cDNA has been isolated, it can be
sequenced by known methods. It is recognized in the art that such methods are
subject to errors, such that multiple sequencing of the same region is routine and
is still expected to lead to measurable rates of mistakes in the resulting deduced
sequence, particularly in regions having repeated domains, extensive secondary
structure, or unusual base compositions, such as regions with high GC base
content. When discrepancies arise, resequencing can be done and can employ
special methods. Special methods can include altering sequencing conditions
by using: different temperatures; different enzymes; proteins which alter the
ability of oligonucleotides to form higher order structures; altered nucleotides
such as ITP or methylated dGTP; different gel compositions, for example
adding formamide; different primers or primers located at different distances
from the problem region; or different templates such as single stranded DNAs.
Sequencing of mRNA also can be employed.

For the most part, some or all of the coding sequence for the polypeptide
having desaturase activity is from a natural source. In some situations,
however, it is desirable to modify all or a portion of the codons, for example, to
enhance expression, by employing host preferred codons. Host preferred
codons can be determined from the codons of highest frequency in the proteins
expressed in the largest amount in a particular host species of interest. Thus, the
coding sequence for a polypeptide having desaturase activity can be
synthesized in whole or in part. All or portions of the DNA also can be
synthesized to remove any destabilizing sequences or regions of secondary
structure which would be present in the transcribed mRNA. All or portions of
the DNA also can be synthesized to alter the base composition to one more
preferable in the desired host cell. Methods for synthesizing sequences and
bringing sequences together are well established in the literature. In vitro
mutagenesis and selection, site-directed mutagenesis, or other means can be
employed to obtain mutations of naturally occurring desaturase genes to
produce a polypeptide having desaturase activity in vivo with more desirable
physical and kinetic parameters for function in the host cell, such as a longer
half-life or a higher rate of production of a desired polyunsaturated fatty acid.
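One way to read "codons of highest frequency in the proteins expressed in the largest amount" is as a simple counting procedure. The Python sketch below tallies codon usage over a set of coding sequences assumed to come from highly expressed host genes, then recodes a peptide with the most frequent codon for each residue. The function names and the toy input sequences are hypothetical; real use would start from a curated set of host genes.

```python
from collections import Counter, defaultdict

# Standard genetic code, built from the conventional TCAG-ordered string.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def preferred_codons(highly_expressed_cds):
    """Map each amino acid to its most frequent codon in the given coding
    sequences (ties are broken alphabetically to keep the result stable)."""
    counts = defaultdict(Counter)
    for cds in highly_expressed_cds:
        cds = cds.upper()
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            aa = CODON_TABLE.get(codon)
            if aa is not None and aa != "*":
                counts[aa][codon] += 1
    return {aa: min(c, key=lambda k: (-c[k], k)) for aa, c in counts.items()}


def recode(peptide, preferred):
    """Back-translate `peptide` using the preferred codon for each residue."""
    return "".join(preferred[aa] for aa in peptide)


# Toy input standing in for highly expressed host genes (illustrative only).
host_genes = ["ATGGCTGCTGCCAAGTAA", "ATGGCTAAGAAATAA"]
table = preferred_codons(host_genes)
print(recode("MAK", table))
```

This sketch ignores the other considerations mentioned above, such as destabilizing sequences and mRNA secondary structure.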

Mortierella alpina Desaturase

Of particular interest is the Mortierella alpina Δ6-desaturase, which has
457 amino acids and a predicted molecular weight of 51.8 kD; the amino acid
sequence is shown in Figure 3. The gene encoding the Mortierella alpina
Δ6-desaturase can be expressed in transgenic microorganisms or animals to effect
greater synthesis of GLA from linoleic acid or of stearidonic acid from ALA.
Other DNAs which are substantially identical to the Mortierella alpina
Δ6-desaturase DNA, or which encode polypeptides which are substantially identical
to the Mortierella alpina Δ6-desaturase polypeptide, also can be used. By
substantially identical is intended an amino acid sequence or nucleic acid
sequence exhibiting in order of increasing preference at least 60%, 80%, 90% or
95% homology to the Mortierella alpina Δ6-desaturase amino acid sequence or
nucleic acid sequence encoding the amino acid sequence. For polypeptides, the
length of comparison sequences generally is at least 16 amino acids, preferably
at least 20 amino acids, or most preferably 35 amino acids. For nucleic acids,
the length of comparison sequences generally is at least 50 nucleotides,
preferably at least 60 nucleotides, and more preferably at least 75 nucleotides,
and most preferably, 110 nucleotides. Homology typically is measured using
sequence analysis software, for example, the Sequence Analysis software
package of the Genetics Computer Group, University of Wisconsin
Biotechnology Center, 1710 University Avenue, Madison, Wisconsin 53705,
MEGAlign (DNAStar, Inc., 1228 S. Park St., Madison, Wisconsin 53715), and
MacVector (Oxford Molecular Group, 2105 S. Bascom Avenue, Suite 200,
Campbell, California 95008). Such software matches similar sequences by
assigning degrees of homology to various substitutions, deletions, and other
modifications. Conservative substitutions typically include substitutions within
the following groups: glycine and alanine; valine, isoleucine and leucine;
aspartic acid, glutamic acid, asparagine, and glutamine; serine and threonine;
lysine and arginine; and phenylalanine and tyrosine. Substitutions may also be
made on the basis of conserved hydrophobicity or hydrophilicity (Kyte and
Doolittle, J. Mol. Biol. 157: 105-132, 1982), or on the basis of the ability to
assume similar polypeptide secondary structure (Chou and Fasman, Adv.
Enzymol. 47: 45-148, 1978).
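As a rough illustration of how a comparison over the stated minimum length might be scored, the Python sketch below compares two already-aligned, ungapped peptide segments and reports percent identity and percent similarity, counting a substitution as similar when both residues fall in one of the conservative groups listed above. It is an assumption-laden simplification: the named software packages use their own alignment and scoring schemes, and the function name `percent_identity_similarity` is hypothetical.

```python
# Conservative substitution groups as listed in the text (one-letter codes).
CONSERVATIVE_GROUPS = [
    set("GA"),    # glycine, alanine
    set("VIL"),   # valine, isoleucine, leucine
    set("DENQ"),  # aspartic acid, glutamic acid, asparagine, glutamine
    set("ST"),    # serine, threonine
    set("KR"),    # lysine, arginine
    set("FY"),    # phenylalanine, tyrosine
]


def percent_identity_similarity(seq_a, seq_b, min_length=16):
    """Score two pre-aligned, equal-length, ungapped peptide segments.

    Returns (percent identity, percent similarity). `min_length` defaults to
    the 16-residue minimum comparison length stated for polypeptides."""
    if len(seq_a) != len(seq_b):
        raise ValueError("segments must be aligned to equal length")
    if len(seq_a) < min_length:
        raise ValueError("comparison is shorter than the minimum length")
    identical = similar = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b:
            identical += 1
            similar += 1
        elif any(a in g and b in g for g in CONSERVATIVE_GROUPS):
            similar += 1
    n = len(seq_a)
    return 100.0 * identical / n, 100.0 * similar / n


# Made-up 16-residue segments, used only to exercise the function.
print(percent_identity_similarity("GAVLDESTKRFYMWCP", "AGILENTSRKYFMWCH"))
```

A result would then be compared against the 60%, 80%, 90% or 95% thresholds given above.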

Also of interest is the Mortierella alpina Δ12-desaturase, the nucleotide
and amino acid sequence of which is shown in Figure 5. The gene encoding the
Mortierella alpina Δ12-desaturase can be expressed in transgenic
microorganisms or animals to effect greater synthesis of LA from oleic acid.
Other DNAs which are substantially identical to the Mortierella alpina
Δ12-desaturase DNA, or which encode polypeptides which are substantially identical
to the Mortierella alpina Δ12-desaturase polypeptide, also can be used.

Other Desaturases

Encompassed by the present invention are related desaturases from the
same or other organisms. Such related desaturases include variants of the
disclosed Δ6- or Δ12-desaturase naturally occurring within the same or different
species of Mortierella, as well as homologues of the disclosed Δ6- or
Δ12-desaturase from other species. Also included are desaturases which, although
not substantially identical to the Mortierella alpina Δ6- or Δ12-desaturase,
desaturate a fatty acid molecule at carbon 6 or 12, respectively, from the
carboxyl end of a fatty acid molecule, or at carbon 12 or 6 from the terminal
methyl carbon in an 18 carbon fatty acid molecule. Related desaturases can be
identified by their ability to function substantially the same as the disclosed
desaturases; that is, are still able to effectively convert LA to GLA, ALA to
SDA or oleic acid to LA. Related desaturases also can be identified by
screening sequence databases for sequences homologous to the disclosed
desaturases, by hybridization of a probe based on the disclosed desaturases to a
library constructed from the source organism, or by RT-PCR using mRNA from
the source organism and primers based on the disclosed desaturases. Such
desaturases include those from humans, Dictyostelium discoideum and
Phaeodactylum tricornutum.

The regions of a desaturase polypeptide important for desaturase activity
can be determined through routine mutagenesis, expression of the resulting
mutant polypeptides and determination of their activities. Mutants may include
deletions, insertions and point mutations, or combinations thereof. A typical
functional analysis begins with deletion mutagenesis to determine the N- and
C-terminal limits of the protein necessary for function, and then internal deletions,
insertions or point mutants are made to further determine regions necessary for
function. Other techniques such as cassette mutagenesis or total synthesis also
can be used. Deletion mutagenesis is accomplished, for example, by using
exonucleases to sequentially remove the 5' or 3' coding regions. Kits are
available for such techniques. After deletion, the coding region is completed by
ligating oligonucleotides containing start or stop codons to the deleted coding
region after 5' or 3' deletion, respectively. Alternatively, oligonucleotides
encoding start or stop codons are inserted into the coding region by a variety of
methods including site-directed mutagenesis, mutagenic PCR or by ligation
onto DNA digested at existing restriction sites. Internal deletions can similarly
be made through a variety of methods including the use of existing restriction
sites in the DNA, by use of mutagenic primers via site directed mutagenesis or
mutagenic PCR. Insertions are made through methods such as linker-scanning
mutagenesis, site-directed mutagenesis or mutagenic PCR. Point mutations are
made through techniques such as site-directed mutagenesis or mutagenic PCR.
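The terminal-deletion step described above, in which coding sequence is removed from one end and the reading frame is then completed with a start or stop codon, can be sketched at the sequence level in Python. The sketch only manipulates strings and says nothing about the laboratory procedure; the function names, the choice of TAA as the stop codon, and the toy open reading frame are assumptions made for illustration.

```python
START, STOP = "ATG", "TAA"  # TAA chosen arbitrarily among the stop codons


def _check(cds, codons_removed):
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    # Keep at least one codon besides the start/stop that is re-added.
    if codons_removed < 0 or 3 * (codons_removed + 2) >= len(cds):
        raise ValueError("deletion size out of range")


def five_prime_deletion(cds, codons_removed):
    """Drop the start codon plus `codons_removed` codons from the 5' end of a
    full open reading frame, then restore a start codon."""
    _check(cds, codons_removed)
    return START + cds[3 * (1 + codons_removed):]


def three_prime_deletion(cds, codons_removed):
    """Drop the stop codon plus `codons_removed` codons from the 3' end of a
    full open reading frame, then restore a stop codon."""
    _check(cds, codons_removed)
    return cds[:-3 * (1 + codons_removed)] + STOP


orf = "ATGGCTAAGGATCTGTAA"  # toy ORF: M A K D L stop
print(five_prime_deletion(orf, 2))   # lacks the two codons after the start
print(three_prime_deletion(orf, 2))  # lacks the two codons before the stop
```

Expressing a series of such constructs and assaying each for desaturase activity is what maps the N- and C-terminal limits needed for function.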

Chemical mutagenesis also can be used for identifying regions of a
desaturase polypeptide important for activity. A mutated construct is expressed,
and the ability of the resulting altered protein to function as a desaturase is
assayed. Such structure-function analysis can determine which regions may be
deleted, which regions tolerate insertions, and which point mutations allow the
mutant protein to function in substantially the same way as the native
desaturase. All such mutant proteins and nucleotide sequences encoding them
are within the scope of the present invention.

EXPRESSION OF DESATURASE GENES 

Once the DNA encoding a desaturase polypeptide has been obtained, it
is placed in a vector capable of replication in a host cell, or is propagated in
vitro by means of techniques such as PCR or long PCR. Replicating vectors
can include plasmids, phage, viruses, cosmids and the like. Desirable vectors
include those useful for mutagenesis of the gene of interest or for expression of
the gene of interest in host cells. The technique of long PCR has made in vitro
propagation of large constructs possible, so that modifications to the gene of
interest, such as mutagenesis or addition of expression signals, and propagation
of the resulting constructs can occur entirely in vitro without the use of a
replicating vector or a host cell.

For expression of a desaturase polypeptide, functional transcriptional
and translational initiation and termination regions are operably linked to the
DNA encoding the desaturase polypeptide. Expression of the polypeptide
coding region can take place in vitro or in a host cell. Transcriptional and
translational initiation and termination regions are derived from a variety of
nonexclusive sources, including the DNA to be expressed, genes known or
suspected to be capable of expression in the desired system, expression vectors,
chemical synthesis, or from an endogenous locus in a host cell.

Expression In Vitro

In vitro expression can be accomplished, for example, by placing the
coding region for the desaturase polypeptide in an expression vector designed
for in vitro use and adding rabbit reticulocyte lysate and cofactors; labeled
amino acids can be incorporated if desired. Such in vitro expression vectors
may provide some or all of the expression signals necessary in the system used.
These methods are well known in the art and the components of the system are
commercially available. The reaction mixture can then be assayed directly for
the polypeptide, for example by determining its activity, or the synthesized
polypeptide can be purified and then assayed.

Expression In A Host Cell 

Expression in a host cell can be accomplished in a transient or stable
fashion. Transient expression can occur from introduced constructs which
contain expression signals functional in the host cell, but which constructs do
not replicate and rarely integrate in the host cell, or where the host cell is not
proliferating. Transient expression also can be accomplished by inducing the
activity of a regulatable promoter operably linked to the gene of interest,
although such inducible systems frequently exhibit a low basal level of
expression. Stable expression can be achieved by introduction of a construct
that can integrate into the host genome or that autonomously replicates in the
host cell. Stable expression of the gene of interest can be selected for through
the use of a selectable marker located on or transfected with the expression
construct, followed by selection for cells expressing the marker. When stable
expression results from integration, integration of constructs can occur
randomly within the host genome or can be targeted through the use of
constructs containing regions of homology with the host genome sufficient to
target recombination with the host locus. Where constructs are targeted to an
endogenous locus, all or some of the transcriptional and translational regulatory
regions can be provided by the endogenous locus.

When increased expression of the desaturase polypeptide in the source
organism is desired, several methods can be employed. Additional genes
encoding the desaturase polypeptide can be introduced into the host organism.
Expression from the native desaturase locus also can be increased through
homologous recombination, for example by inserting a stronger promoter into
the host genome to cause increased expression, by removing destabilizing
sequences from either the mRNA or the encoded protein by deleting that
information from the host genome, or by adding stabilizing sequences to the
mRNA (USPN 4,910,141).

When it is desirable to express more than one different gene, using appropriate
regulatory regions and expression methods, the introduced genes can be propagated
in the host cell through use of replicating vectors or by integration into the host
genome. Where two or more genes are expressed from separate replicating
vectors, it is desirable that each vector has a different means of replication.
Each introduced construct, whether integrated or not, should have a different
means of selection and should lack homology to the other constructs to maintain
stable expression and prevent reassortment of elements among constructs.
Judicious choices of regulatory regions, selection means and method of
propagation of the introduced construct can be experimentally determined so
that all introduced genes are expressed at the necessary levels to provide for
synthesis of the desired products.

As an example, where the host cell is a yeast, transcriptional and
translational regions functional in yeast cells are provided, particularly from the
host species. The transcriptional initiation regulatory regions can be obtained,
for example from genes in the glycolytic pathway, such as alcohol
dehydrogenase, glyceraldehyde-3-phosphate dehydrogenase (GPD),
phosphoglucoisomerase, phosphoglycerate kinase, etc. or regulatable genes
such as acid phosphatase, lactase, metallothionein, glucoamylase, etc. Any one
of a number of regulatory sequences can be used in a particular situation,
depending upon whether constitutive or induced transcription is desired, the
particular efficiency of the promoter in conjunction with the open-reading frame
of interest, the ability to join a strong promoter with a control region from a
different promoter which allows for inducible transcription, ease of
construction, and the like. Of particular interest are promoters which are
activated in the presence of galactose. Galactose-inducible promoters (GAL1,
GAL7, and GAL10) have been extensively utilized for high level and regulated
expression of protein in yeast (Lue et al., Mol. Cell. Biol. Vol. 7, p. 3446, 1987;
Johnston, Microbiol. Rev. Vol. 51, p. 458, 1987). Transcription from the GAL
promoters is activated by the GAL4 protein, which binds to the promoter region
and activates transcription when galactose is present. In the absence of
galactose, the antagonist GAL80 binds to GAL4 and prevents GAL4 from
activating transcription. Addition of galactose prevents GAL80 from inhibiting
activation by GAL4.
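
The regulatory logic just described can be summarized as a simple truth
function. The following is an illustrative sketch only: it treats each
component as present or absent and ignores quantitative effects and any other
regulatory inputs; the function name is hypothetical.

```python
def gal_promoter_active(gal4_present: bool,
                        gal80_present: bool,
                        galactose_present: bool) -> bool:
    """Boolean sketch of GAL promoter activation.

    GAL4 is required for activation. GAL80 blocks GAL4 unless galactose
    is present, in which case GAL80 can no longer inhibit GAL4.
    """
    if not gal4_present:
        return False
    gal80_inhibits = gal80_present and not galactose_present
    return not gal80_inhibits


# Wild-type cells (GAL4 and GAL80 both present), with and without galactose.
induced = gal_promoter_active(True, True, True)
uninduced = gal_promoter_active(True, True, False)
```

In this simplified model a host with both proteins expresses the gene of
interest only after galactose is added, which is the behavior relied upon when
a GAL promoter is used for regulated expression.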

Nucleotide sequences surrounding the translational initiation codon
ATG have been found to affect expression in yeast cells. If the desired
polypeptide is poorly expressed in yeast, the nucleotide sequences of exogenous
genes can be modified to include an efficient yeast translation initiation
sequence to obtain optimal gene expression. For expression in Saccharomyces,
this can be done by site-directed mutagenesis of an inefficiently expressed gene
by fusing it in-frame to an endogenous Saccharomyces gene, preferably a highly
expressed gene, such as the lactase gene.

The termination region can be derived from the 3' region of the gene
from which the initiation region was obtained or from a different gene. A large
number of termination regions are known and have been found to be
satisfactory in a variety of hosts from the same and different genera and species.
The termination region usually is selected more as a matter of convenience
rather than because of any particular property. Preferably, the termination
region is derived from a yeast gene, particularly Saccharomyces,
Schizosaccharomyces, Candida or Kluyveromyces. The 3' regions of two
mammalian genes, γ interferon and α2 interferon, are also known to function in
yeast.

INTRODUCTION OF CONSTRUCTS INTO HOST CELLS

Constructs comprising the gene of interest may be introduced into a host
cell by standard techniques. These techniques include transformation,
protoplast fusion, lipofection, transfection, transduction, conjugation, infection,
biolistic impact, electroporation, microinjection, scraping, or any other method
which introduces the gene of interest into the host cell. Methods of
transformation which are used include lithium acetate transformation (Methods
in Enzymology, Vol. 194, p. 186-187, 1991). For convenience, a host cell which
has been manipulated by any method to take up a DNA sequence or construct
will be referred to as "transformed" or "recombinant" herein.

The subject host will have at least one copy of the expression
construct and may have two or more, depending upon whether the gene is
integrated into the genome, amplified, or is present on an extrachromosomal
element having multiple copy numbers. Where the subject host is a yeast, four
principal types of yeast plasmid vectors can be used: Yeast Integrating plasmids
(YIps), Yeast Replicating plasmids (YRps), Yeast Centromere plasmids
(YCps), and Yeast Episomal plasmids (YEps). YIps lack a yeast replication
origin and must be propagated as integrated elements in the yeast genome.
YRps have a chromosomally derived autonomously replicating sequence and
are propagated as medium copy number (20 to 40), autonomously replicating,
unstably segregating plasmids. YCps have both a replication origin and a
centromere sequence and propagate as low copy number (10-20), autonomously
replicating, stably segregating plasmids. YEps have an origin of replication
from the yeast 2µm plasmid and are propagated as high copy number,
autonomously replicating, irregularly segregating plasmids. The presence of the
plasmids in yeast can be ensured by maintaining selection for a marker on the
plasmid. Of particular interest are the yeast vectors pYES2 (a YEp plasmid
available from Invitrogen, conferring uracil prototrophy and having a GAL1
galactose-inducible promoter for expression), pRS425-pG1 (a YEp plasmid
obtained from Dr. T. H. Chang, Ass. Professor of Molecular Genetics, Ohio State
University, containing a constitutive GPD promoter and conferring leucine
prototrophy), and pYX424 (a YEp plasmid having a constitutive TPI promoter
and conferring leucine prototrophy; Alber, T. and Kawasaki, G. (1982). J. Mol.
& Appl. Genetics 1: 419).
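
The four yeast plasmid classes described above can be tabulated for quick
comparison. The following is a minimal sketch, assuming a simple record per
class; the field names and the helper function are hypothetical, and the
copy-number and segregation entries merely restate the description given in
the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class YeastVectorClass:
    abbreviation: str
    replication_origin: str  # "none" means the plasmid cannot replicate on its own
    copy_number: str         # as stated in the text
    segregation: str         # as stated in the text


VECTOR_CLASSES = [
    YeastVectorClass("YIp", "none", "integrated element", "n/a (integrated)"),
    YeastVectorClass("YRp", "chromosomal ARS", "medium (20 to 40)", "unstable"),
    YeastVectorClass("YCp", "replication origin plus centromere", "low (10-20)", "stable"),
    YeastVectorClass("YEp", "yeast 2 micron plasmid origin", "high", "irregular"),
]


def autonomously_replicating(classes):
    """Return abbreviations of classes that replicate without integrating."""
    return [c.abbreviation for c in classes if c.replication_origin != "none"]


episomal_choices = autonomously_replicating(VECTOR_CLASSES)
```

Such a table makes explicit that only YIps depend on integration for
propagation, whereas the other three classes are maintained by selection for a
plasmid-borne marker.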

The transformed host cell can be identified by selection for a marker
contained on the introduced construct. Alternatively, a separate marker
construct may be introduced with the desired construct, as many transformation
techniques introduce many DNA molecules into host cells. Typically,
transformed hosts are selected for their ability to grow on selective media.
Selective media may incorporate an antibiotic or lack a factor necessary for
growth of the untransformed host, such as a nutrient or growth factor. An
introduced marker gene therefore may confer antibiotic resistance, or encode an
essential growth factor or enzyme, and permit growth on selective media when
expressed in the transformed host. Selection of a transformed host can also
occur when the expressed marker protein can be detected, either directly or
indirectly. The marker protein may be expressed alone or as a fusion to another
protein. The marker protein can be detected by its enzymatic activity; for
example β-galactosidase can convert the substrate X-gal to a colored product,
and luciferase can convert luciferin to a light-emitting product. The marker
protein can be detected by its light-producing or modifying characteristics; for
example, the green fluorescent protein of Aequorea victoria fluoresces when
illuminated with blue light. Antibodies can be used to detect the marker
protein or a molecular tag on, for example, a protein of interest. Cells
expressing the marker protein or tag can be selected, for example, visually, or
by techniques such as FACS or panning using antibodies. For selection of yeast
transformants, any marker that functions in yeast may be used. Desirably,
resistance to kanamycin and the aminoglycoside G418 are of interest, as well as
ability to grow on media lacking uracil, leucine, lysine or tryptophan.

Of particular interest is the Δ6- and Δ12-desaturase-mediated production
of PUFAs in prokaryotic and eukaryotic host cells. Prokaryotic cells of interest
include Escherichia, Bacillus, Lactobacillus, cyanobacteria and the like.
Eukaryotic cells include mammalian cells such as those of lactating animals,
avian cells such as of chickens, and other cells amenable to genetic
manipulation including insect, fungal, and algae cells. The cells may be
cultured or formed as part or all of a host organism including an animal.
Viruses and bacteriophage also may be used with the cells in the production of
PUFAs, particularly for gene transfer, cellular targeting and selection. In a
preferred embodiment, the host is any microorganism or animal which produces
and/or can assimilate exogenously supplied substrate(s) for a Δ6- and/or Δ12-
desaturase, and preferably produces large amounts of one or more of the
substrates. Examples of host animals include mice, rats, rabbits, chickens, quail,
turkeys, bovines, sheep, pigs, goats, yaks, etc., which are amenable to genetic
manipulation and cloning for rapid expansion of the transgene-expressing
population. For animals, the desaturase transgene(s) can be adapted for
expression in target organelles, tissues and body fluids through modification of
the gene regulatory regions. Of particular interest is the production of PUFAs
in the breast milk of the host animal.



Expression In Yeast

Examples of host microorganisms include Saccharomyces cerevisiae,
Saccharomyces carlsbergensis, or other yeast such as Candida, Kluyveromyces
or other fungi, for example, filamentous fungi such as Aspergillus, Neurospora,
Penicillium, etc. Desirable characteristics of a host microorganism are, for
example, that it is genetically well characterized, can be used for high level
expression of the product using ultra-high density fermentation, and is on the
GRAS (generally recognized as safe) list since the proposed end product is
intended for ingestion by humans. Of particular interest is use of a yeast, more
particularly baker's yeast (S. cerevisiae), as a cell host in the subject invention.
Strains of particular interest are SC334 (Mat a pep4-3 prb1-1122 ura3-52
leu2-3,112 reg1-501 gal1; Gene 83:57-64, 1989, Hovland P. et al.), YTC34 (a
ade2-101 his3Δ200 lys2-801 ura3-52; obtained from Dr. T. H. Chang, Ass.
Professor of Molecular Genetics, Ohio State University), YTC41 (a/α
ura3-52/ura3-52 lys2-801/lys2-801 ade2-101/ade2-101 trp1-Δ1/trp1-Δ1
his3Δ200/his3Δ200 leu2Δ1/leu2Δ1; obtained from Dr. T. H. Chang, Ass.
Professor of Molecular Genetics, Ohio State University), BJ1995 (obtained from
the Yeast Genetic Stock Centre, 1021 Donner Laboratory, Berkeley, CA 94720),
INVSC1 (Mat a his3Δ1 leu2 trp1-289 ura3-52; obtained from Invitrogen, 1600
Faraday Ave., Carlsbad, CA 92008) and INVSC2 (Mat a his3Δ200 ura3-167;
obtained from Invitrogen).

Expression in Avian Species

For producing PUFAs in avian species and cells, such as chickens,
turkeys, quail and ducks, gene transfer can be performed by introducing a
nucleic acid sequence encoding a Δ6 and/or Δ12-desaturase into the cells
following procedures known in the art. If a transgenic animal is desired,
pluripotent stem cells of embryos can be provided with a vector carrying a
desaturase encoding transgene and developed into an adult animal (USPN
5,162,215; Ono et al. (1996) Comparative Biochemistry and Physiology A
113(3):287-292; WO 96/12793; WO 96/06160). In most cases, the transgene
will be modified to express high levels of the desaturase in order to increase
production of PUFAs. The transgene can be modified, for example, by
providing transcriptional and/or translational regulatory regions that function in
avian cells, such as promoters which direct expression in particular tissues and
egg parts such as yolk. The gene regulatory regions can be obtained from a
variety of sources, including chicken anemia or avian leukosis viruses or avian
genes such as a chicken ovalbumin gene.



Expression in Insect Cells 

Production of PUFAs in insect cells can be conducted using baculovirus
expression vectors harboring one or more desaturase transgenes. Baculovirus
expression vectors are available from several commercial sources such as
Clontech. Methods for producing hybrid and transgenic strains of algae, such
as marine algae, which contain and express a desaturase transgene also are
provided. For example, transgenic marine algae may be prepared as described
in USPN 5,426,040. As with the other expression systems described above, the
timing, extent of expression and activity of the desaturase transgene can be
regulated by fitting the polypeptide coding sequence with the appropriate
transcriptional and translational regulatory regions selected for a particular use.
Of particular interest are promoter regions which can be induced under
preselected growth conditions. For example, introduction of temperature
sensitive and/or metabolite responsive mutations into the desaturase transgene
coding sequences, its regulatory regions, and/or the genome of cells into which
the transgene is introduced can be used for this purpose.

The transformed host cell is grown under appropriate conditions adapted
for a desired end result. For host cells grown in culture, the conditions are
typically optimized to produce the greatest or most economical yield of PUFAs,
which relates to the selected desaturase activity. Media conditions which may
be optimized include: carbon source, nitrogen source, addition of substrate,
final concentration of added substrate, form of substrate added, aerobic or
anaerobic growth, growth temperature, inducing agent, induction temperature,
growth phase at induction, growth phase at harvest, pH, density, and
maintenance of selection. Microorganisms of interest, such as yeast, are
preferably grown in selected medium. For yeast, complex media such as
yeast peptone broth (YPD) or a defined media such as a minimal media (contains
amino acids, yeast nitrogen base, and ammonium sulfate, and lacks a
component for selection, for example uracil) are preferred. Desirably,
substrates to be added are first dissolved in ethanol. Where necessary,
expression of the polypeptide of interest may be induced, for example by
including or adding galactose to induce expression from a GAL promoter.

Expression In Plants 

Production of PUFAs in plants can be conducted using various plant
transformation systems such as the use of Agrobacterium tumefaciens, plant
viruses, particle cell transformation and the like which are disclosed in
Applicant's related applications U.S. Application Serial Nos. 08/834,033 and
08/956,985 and continuation-in-part applications filed simultaneously with this
application, all of which are hereby incorporated by reference.

Expression In An Animal

Expression in cells of a host animal can likewise be accomplished in a
transient or stable manner. Transient expression can be accomplished via known
methods, for example infection or lipofection, and can be repeated in order to
maintain desired expression levels of the introduced construct (see Ebert, PCT
publication WO 94/05782). Stable expression can be accomplished via
integration of a construct into the host genome, resulting in a transgenic animal.
The construct can be introduced, for example, by microinjection of the construct
into the pronuclei of a fertilized egg, or by transfection, retroviral infection or
other techniques whereby the construct is introduced into a cell line which may
form or be incorporated into an adult animal (U.S. Patent No. 4,873,191; U.S.
Patent No. 5,530,177; U.S. Patent No. 5,565,362; U.S. Patent No. 5,366,894;
Wilmut et al. (1997) Nature 385:810). The recombinant eggs or embryos are
transferred to a surrogate mother (U.S. Patent No. 4,873,191; U.S. Patent No.
5,530,177; U.S. Patent No. 5,565,362; U.S. Patent No. 5,366,894; Wilmut et al.
(supra)).

After birth, transgenic animals are identified, for example, by the
presence of an introduced marker gene, such as for coat color, or by PCR or
Southern blotting from a blood, milk or tissue sample to detect the introduced
construct, or by an immunological or enzymological assay to detect the
expressed protein or the products produced therefrom (U.S. Patent No.
4,873,191; U.S. Patent No. 5,530,177; U.S. Patent No. 5,565,362; U.S. Patent
No. 5,366,894; Wilmut et al. (supra)). The resulting transgenic animals may be
entirely transgenic or may be mosaics, having the transgenes in only a subset of
their cells. The advent of mammalian cloning, accomplished by fusing a
nucleated cell with an enucleated egg, followed by transfer into a surrogate
mother, presents the possibility of rapid, large-scale production upon obtaining
a "founder" animal or cell comprising the introduced construct; prior to this, it
was necessary for the transgene to be present in the germ line of the animal for
propagation (Wilmut et al. (supra)).

Expression in a host animal presents certain efficiencies, particularly
where the host is a domesticated animal. For production of PUFAs in a fluid
readily obtainable from the host animal, such as milk, the desaturase transgene
can be expressed in mammary cells from a female host, and the PUFA content
of the host cells altered. The desaturase transgene can be adapted for expression
so that it is retained in the mammary cells, or secreted into milk, to form the
PUFA reaction products localized to the milk (PCT publication WO 95/24488).
Expression can be targeted to mammary tissue using specific
regulatory sequences, such as those of bovine α-lactalbumin, α-casein,
β-casein, γ-casein, κ-casein, β-lactoglobulin, or whey acidic protein, and may
optionally include one or more introns and/or secretory signal sequences (U.S.
Patent No. 5,530,177; Rosen, U.S. Patent No. 5,565,362; Clark et al., U.S.
Patent No. 5,366,894; Garner et al., PCT publication WO 95/23868).
Expression of desaturase transgenes, or antisense desaturase transcripts, adapted
in this manner can be used to alter the levels of specific PUFAs, or derivatives
thereof, found in the animal's milk. Additionally, the desaturase transgene(s)
can be expressed either by itself or with other transgenes, in order to produce
animal milk containing higher proportions of desired PUFAs or PUFA ratios
and concentrations that resemble human breast milk (Prieto et al., PCT
publication WO 95/24494).

PURIFICATION OF FATTY ACIDS 

The desaturated fatty acids may be found in the host microorganism or
animal as free fatty acids or in conjugated forms such as acylglycerols,
phospholipids, sulfolipids or glycolipids, and may be extracted from the host
cell through a variety of means well-known in the art. Such means may include
extraction with organic solvents, sonication, supercritical fluid extraction using
for example carbon dioxide, and physical means such as presses, or
combinations thereof. Of particular interest is extraction with hexane or
methanol and chloroform. Where desirable, the aqueous layer can be acidified
to protonate negatively charged moieties and thereby increase partitioning of
desired products into the organic layer. After extraction, the organic solvents
can be removed by evaporation under a stream of nitrogen. When isolated in
conjugated forms, the products may be enzymatically or chemically cleaved to
release the free fatty acid or a less complex conjugate of interest, and can then
be subject to further manipulations to produce a desired end product. Desirably,
conjugated forms of fatty acids are cleaved with potassium hydroxide.
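
The effect of acidifying the aqueous layer, mentioned above, can be estimated
with the Henderson-Hasselbalch relationship. The following is a rough sketch
only: the pKa of 4.8 is an assumed, typical value for a carboxylic acid group
and is not taken from the present specification, and the apparent pKa of a
long-chain fatty acid in a real extraction mixture may differ.

```python
ASSUMED_PKA = 4.8  # assumption: typical carboxylic acid pKa, for illustration only


def fraction_protonated(ph: float, pka: float = ASSUMED_PKA) -> float:
    """Fraction of a monoprotic acid present in its uncharged (protonated) form.

    Derived from the Henderson-Hasselbalch equation:
        pH = pKa + log10([A-] / [HA])
    so that [HA] / ([HA] + [A-]) = 1 / (1 + 10 ** (pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Compare a near-neutral aqueous layer with an acidified one.
neutral_fraction = fraction_protonated(7.0)
acidified_fraction = fraction_protonated(2.0)
```

Under this assumption, lowering the pH well below the pKa converts nearly all
of the carboxylate to the uncharged acid, which is the form that partitions
into the organic layer.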

If further purification is necessary, standard methods can be employed.
Such methods may include extraction, treatment with urea, fractional
crystallization, HPLC, fractional distillation, silica gel chromatography, high
speed centrifugation or distillation, or combinations of these techniques.
Protection of reactive groups, such as the acid or alkenyl groups, may be done at
any step through known techniques, for example alkylation or iodination.
Methods used include methylation of the fatty acids to produce methyl esters.
Similarly, protecting groups may be removed at any step. Desirably,
purification of fractions containing GLA, SDA, ARA, DHA and EPA may be
accomplished by treatment with urea and/or fractional distillation.

USES OF FATTY ACIDS

The fatty acids of the subject invention find many applications. Probes
based on the DNAs of the present invention may find use in methods for
isolating related molecules or in methods to detect organisms expressing
desaturases. When used as probes, the DNAs or oligonucleotides must be
detectable. This is usually accomplished by attaching a label either at an
internal site, for example via incorporation of a modified residue, or at the 5' or
3' terminus. Such labels can be directly detectable, can bind to a secondary
molecule that is detectably labeled, or can bind to an unlabelled secondary
molecule and a detectably labeled tertiary molecule; this process can be
extended as long as is practical to achieve a satisfactorily detectable signal
without unacceptable levels of background signal. Secondary, tertiary, or
bridging systems can include use of antibodies directed against any other
molecule, including labels or other antibodies, or can involve any molecules
which bind to each other, for example a biotin-streptavidin/avidin system.
Detectable labels typically include radioactive isotopes, molecules which
chemically or enzymatically produce or alter light, enzymes which produce
detectable reaction products, magnetic molecules, fluorescent molecules or
molecules whose fluorescence or light-emitting characteristics change upon
binding. Examples of labelling methods can be found in USPN 5,011,770.
Alternatively, the binding of target molecules can be directly detected by
measuring the change in heat of solution on binding of probe to target via
isothermal titration calorimetry, or by coating the probe or target on a surface
and detecting the change in scattering of light from the surface produced by
binding of target or probe, respectively, as may be done with the BIAcore
system.

PUFAs produced by recombinant means find applications in a wide
variety of areas. Supplementation of animals or humans with PUFAs in various
forms can result in increased levels not only of the added PUFAs but of their
metabolic progeny as well.

NUTRITIONAL COMPOSITIONS 

The present invention also includes nutritional compositions. Such
compositions, for purposes of the present invention, include any food or
preparation for human consumption including for enteral or parenteral
consumption, which when taken into the body (a) serve to nourish or build up
tissues or supply energy and/or (b) maintain, restore or support adequate
nutritional status or metabolic function.

The nutritional composition of the present invention comprises at least
one oil or acid produced in accordance with the present invention and may
either be in a solid or liquid form. Additionally, the composition may include
edible macronutrients, vitamins and minerals in amounts desired for a particular
use. The amount of such ingredients will vary depending on whether the
composition is intended for use with normal, healthy infants, children or adults
having specialized needs such as those which accompany certain metabolic
conditions (e.g., metabolic disorders).

Examples of macronutrients which may be added to the composition
include but are not limited to edible fats, carbohydrates and proteins. Examples
of such edible fats include but are not limited to coconut oil, soy oil, and mono-
and diglycerides. Examples of such carbohydrates include but are not limited to
glucose, edible lactose and hydrolyzed starch. Additionally, examples of
proteins which may be utilized in the nutritional composition of the invention
include but are not limited to soy proteins, electrodialysed whey,
electrodialysed skim milk, milk whey, or the hydrolysates of these proteins.

With respect to vitamins and minerals, the following may be added to 
the nutritional compositions of the present invention: calcium, phosphorus, 
potassium, sodium, chloride, magnesium, manganese, iron, copper, zinc, 
selenium, iodine, and Vitamins A, E, D, C, and the B complex. Other such 
vitamins and minerals may also be added.

The components utilized in the nutritional compositions of the present
invention will be of semi-purified or purified origin. By semi-purified or purified
is meant a material which has been prepared by purification of a natural
material or by synthesis.

Examples of nutritional compositions of the present invention include
but are not limited to infant formulas, dietary supplements, and rehydration
compositions. Nutritional compositions of particular interest include but are not
limited to those utilized for enteral and parenteral supplementation for infants,
specialist infant formulae, supplements for the elderly, and supplements for
those with gastrointestinal difficulties and/or malabsorption.

Nutritional Compositions 

A typical nutritional composition of the present invention will contain
edible macronutrients, vitamins and minerals in amounts desired for a particular
use. The amounts of such ingredients will vary depending on whether the
formulation is intended for use with normal, healthy individuals temporarily
exposed to stress, or to subjects having specialized needs due to certain chronic
or acute disease states (e.g., metabolic disorders). It will be understood by
persons skilled in the art that the components utilized in a nutritional
formulation of the present invention are of semi-purified or purified origin. By
semi-purified or purified is meant a material that has been prepared by
purification of a natural material or by synthesis. These techniques are well
known in the art (See, e.g., Code of Federal Regulations for Food Ingredients
and Food Processing; Recommended Dietary Allowances, 10th Ed., National
Academy Press, Washington, D.C., 1989).

In a preferred embodiment, a nutritional formulation of the present
invention is an enteral nutritional product, more preferably an adult or child
enteral nutritional product. Accordingly, in a further aspect of the invention, a
nutritional formulation is provided that is suitable for feeding adults or children
who are experiencing stress. The formula comprises, in addition to the PUFAs
of the invention, macronutrients, vitamins and minerals in amounts designed to
provide the daily nutritional requirements of adults.

The macronutritional components include edible fats, carbohydrates and 
proteins. Exemplary edible fats are coconut oil, soy oil, and mono- and 
diglycerides and the PUFA oils of this invention. Exemplary carbohydrates are 
glucose, edible lactose and hydrolyzed cornstarch. A typical protein source 
would be soy protein, electrodialysed whey or electrodialysed skim milk or milk 
whey, or the hydrolysates of these proteins, although other protein sources are 
also available and may be used. These macronutrients would be added in the 
form of commonly accepted nutritional compounds in amounts equivalent to 
those present in human milk on an energy basis, i.e., on a per calorie basis. 

Methods for formulating liquid and enteral nutritional formulas are well 
known in the art and are described in detail in the examples. 

The enteral formula can be sterilized and subsequently utilized on a 
ready-to-feed (RTF) basis or stored in a concentrated liquid or a powder. The 
powder can be prepared by spray drying the enteral formula prepared as 
indicated above, and the formula can be reconstituted by rehydrating the 
concentrate. Adult and infant nutritional formulas are well known in the art and 
commercially available (e.g., Similac®, Ensure®, Jevity® and Alimentum® 
from Ross Products Division, Abbott Laboratories). An oil or acid of the 
present invention can be added to any of these formulas in the amounts 
described below. 

The energy density of the nutritional composition, when in liquid form, 
can typically range from about 0.6 to 3.0 Kcal per ml. When in solid or 
powdered form, the nutritional supplement can contain from about 1.2 to more 
than 9 Kcals per gm, preferably 3 to 7 Kcals per gm. In general, the osmolality 
of a liquid product should be less than 700 mOsm and more preferably less than 
660 mOsm. 
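
As an illustration of the energy density figures above, the following minimal Python sketch computes the energy supplied by one liquid serving. The function names and the 250 ml serving volume are hypothetical and chosen only for illustration; the 0.6 to 3.0 Kcal per ml bounds are the typical range stated above, not a hard limit.

```python
# Illustrative sketch only: names and the example serving volume are assumptions.

TYPICAL_LIQUID_KCAL_PER_ML = (0.6, 3.0)  # typical range stated in the text


def within_typical_density(kcal_per_ml: float) -> bool:
    """Return True if a liquid energy density lies in the typical range."""
    low, high = TYPICAL_LIQUID_KCAL_PER_ML
    return low <= kcal_per_ml <= high


def kcal_per_serving(volume_ml: float, kcal_per_ml: float) -> float:
    """Energy supplied by one liquid serving, in Kcal."""
    if volume_ml <= 0 or kcal_per_ml <= 0:
        raise ValueError("volume and energy density must be positive")
    return volume_ml * kcal_per_ml


if __name__ == "__main__":
    # A hypothetical 250 ml serving at 1.0 Kcal per ml
    print(kcal_per_serving(250, 1.0), within_typical_density(1.0))
```
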

The nutritional formula would typically include vitamins and minerals, 
in addition to the PUFAs of the invention, in order to help the individual ingest 
the minimum daily requirements for these substances. In addition to the PUFAs 
listed above, it may also be desirable to supplement the nutritional composition 
with zinc, copper, and folic acid in addition to antioxidants. It is believed that 
these substances will also provide a boost to the stressed immune system and 
thus will provide further benefits to the individual. The presence of zinc, 
copper or folic acid is optional and is not required in order to gain the beneficial 
effects on immune suppression. Likewise, a pharmaceutical composition can be 
supplemented with these same substances. 

In a more preferred embodiment, the nutritional composition contains, in addition to 
the antioxidant system and the PUFA component, a source of carbohydrate 
wherein at least 5 weight % of said carbohydrate is an indigestible 
oligosaccharide. In yet a more preferred embodiment, the nutritional 
composition additionally contains protein, taurine and carnitine. 

The PUFAs, or derivatives thereof, made by the disclosed method can 
be used as dietary substitutes, or supplements, particularly infant formulas, for 
patients undergoing intravenous feeding or for preventing or treating 
malnutrition. Typically, human breast milk has a fatty acid profile comprising 
from about 0.15 % to about 0.36 % as DHA, from about 0.03 % to about 0.13 % 
as EPA, from about 0.30 % to about 0.88 % as ARA, from about 0.22 % to 
about 0.67 % as DGLA, and from about 0.27 % to about 1.04 % as GLA. 
Additionally, the predominant triglyceride in human milk has been reported to 
be 1,3-di-oleoyl-2-palmitoyl, with 2-palmitoyl glycerides reported as better 
absorbed than 2-oleoyl or 2-linoleoyl glycerides (USPN 4,876,107). Thus, fatty 
acids such as ARA, DGLA, GLA and/or EPA produced by the invention can be 
used to alter the composition of infant formulas to better replicate the PUFA 
composition of human breast milk. In particular, an oil composition for use in a 
pharmacologic or food supplement, particularly a breast milk substitute or 
supplement, will preferably comprise one or more of ARA, DGLA and GLA. 
More preferably the oil will comprise from about 0.3 to 30% ARA, from about 
0.2 to 30% DGLA, and from about 0.2 to about 30% GLA. 

In addition to the concentration, the ratios of ARA, DGLA and GLA can 
be adapted for a particular given end use. When formulated as a breast milk 
supplement or substitute, an oil composition which contains two or more of 
ARA, DGLA and GLA will be provided in a ratio of about 1:19:30 to about 
6:1:0.2, respectively. For example, the breast milk of animals can vary in ratios 
of ARA:DGLA:GLA ranging from 1:19:30 to 6:1:0.2, which includes 
intermediate ratios which are preferably about 1:1:1, 1:2:1, 1:1:4. When 
produced together in a host cell, adjusting the rate and percent of conversion of 
a precursor substrate such as GLA and DGLA to ARA can be used to precisely 
control the PUFA ratios. For example, a 5% to 10% conversion rate of DGLA 
to ARA can be used to produce an ARA to DGLA ratio of about 1:19, whereas 
a conversion rate of about 75% to 80% can be used to produce an ARA to 
DGLA ratio of about 6:1. Therefore, whether in a cell culture system or in a 
host animal, regulating the timing, extent and specificity of desaturase 
expression as described can be used to modulate the PUFA levels and ratios. 
Depending on the expression system used, e.g., cell culture or an animal 
expressing oil(s) in its milk, the oils also can be isolated and recombined in the 
desired concentrations and ratios. Amounts of oils providing these ratios of 
PUFA can be determined following standard protocols. PUFAs, or host cells 
containing them, also can be used as animal food supplements to alter an 
animal's tissue or milk fatty acid composition to one more desirable for human 
or animal consumption. 
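
The relationship between percent conversion and product-to-substrate ratio described above can be sketched in Python under a deliberately simple assumption: a closed pool in which DGLA is converted only to ARA and neither species is otherwise consumed or replenished. The function names are hypothetical. Under this assumption a 5% conversion gives exactly the 1:19 ratio stated above, whereas 75% to 80% conversion gives 3:1 to 4:1 and a 6:1 ratio corresponds to roughly 86% conversion, so the higher figures quoted above presumably reflect factors that this sketch does not model.

```python
# Illustrative sketch only: assumes a closed DGLA -> ARA pool with no other
# sources or sinks. Function names are hypothetical.
from fractions import Fraction


def product_to_substrate_ratio(conversion: Fraction) -> Fraction:
    """ARA:DGLA ratio after a fraction `conversion` of DGLA has become ARA."""
    if not 0 < conversion < 1:
        raise ValueError("conversion must lie strictly between 0 and 1")
    return conversion / (1 - conversion)


def conversion_for_ratio(ratio: Fraction) -> Fraction:
    """Fraction of DGLA that must be converted to reach a given ARA:DGLA ratio."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return ratio / (1 + ratio)


if __name__ == "__main__":
    for percent in (5, 10, 75, 80):
        r = product_to_substrate_ratio(Fraction(percent, 100))
        print(f"{percent}% conversion -> ARA:DGLA = {r.numerator}:{r.denominator}")
    print("conversion needed for 6:1 =", float(conversion_for_ratio(Fraction(6))))
```
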

For dietary supplementation, the purified PUFAs, or derivatives thereof, 
may be incorporated into cooking oils, fats or margarines formulated so that in 
normal use the recipient would receive the desired amount. The PUFAs may 
also be incorporated into infant formulas, nutritional supplements or other food 
products, and may find use as anti-inflammatory or cholesterol lowering agents. 

Pharmaceutical Compositions 

The present invention also encompasses a pharmaceutical composition 
comprising one or more of the acids and/or resulting oils produced in 
accordance with the methods described herein. More specifically, such a 
pharmaceutical composition may comprise one or more of the acids and/or oils 
as well as a standard, well-known, non-toxic pharmaceutically acceptable 
carrier, adjuvant or vehicle such as, for example, phosphate buffered saline, 
water, ethanol, polyols, vegetable oils, a wetting agent or an emulsion such as a 
water/oil emulsion. The composition may be in either a liquid or solid form. 
For example, the composition may be in the form of a tablet, capsule, ingestible 
liquid or powder, injectable, or topical ointment or cream. 

Possible routes of administration include, for example, oral, rectal and 
parenteral. The route of administration will, of course, depend upon the desired 
effect. For example, if the composition is being utilized to treat rough, dry, or 
aging skin, to treat injured or burned skin, or to treat skin or hair affected by a 
disease or condition, it may perhaps be applied topically. 

The dosage of the composition to be administered to the patient may be 
determined by one of ordinary skill in the art and depends upon various factors 
such as weight of the patient, age of the patient, immune status of the patient, 
etc. 

With respect to form, the composition may be, for example, a solution, a 
dispersion, a suspension, an emulsion or a sterile powder which is then 
reconstituted. 

Additionally, the composition of the present invention may be utilized 
for cosmetic purposes. It may be added to pre-existing cosmetic compositions 
such that a mixture is formed or may be used as a sole composition. 

Pharmaceutical compositions may be utilized to administer the PUFA 
component to an individual. Suitable pharmaceutical compositions may 
comprise physiologically acceptable sterile aqueous or non-aqueous solutions, 
dispersions, suspensions or emulsions and sterile powders for reconstitution into 
sterile solutions or dispersions for ingestion. Examples of suitable aqueous and 
non-aqueous carriers, diluents, solvents or vehicles include water, ethanol, 
polyols (propyleneglycol, polyethyleneglycol, glycerol, and the like), suitable 
mixtures thereof, vegetable oils (such as olive oil) and injectable organic esters 
such as ethyl oleate. Proper fluidity can be maintained, for example, by the 
maintenance of the required particle size in the case of dispersions and by the 
use of surfactants. It may also be desirable to include isotonic agents, for 
example sugars, sodium chloride and the like. Besides such inert diluents, the 
composition can also include adjuvants, such as wetting agents, emulsifying and 
suspending agents, sweetening, flavoring and perfuming agents. 

Suspensions, in addition to the active compounds, may contain 
suspending agents, as for example, ethoxylated isostearyl alcohols, 
polyoxyethylene sorbitol and sorbitan esters, microcrystalline cellulose, 
aluminum metahydroxide, bentonite, agar-agar and tragacanth or mixtures of 
these substances, and the like. 

Solid dosage forms such as tablets and capsules can be prepared using 
techniques well known in the art. For example, PUFAs of the invention can be 
tableted with conventional tablet bases such as lactose, sucrose, and cornstarch 
in combination with binders such as acacia, cornstarch or gelatin, disintegrating 
agents such as potato starch or alginic acid and a lubricant such as stearic acid 
or magnesium stearate. Capsules can be prepared by incorporating these 
excipients into a gelatin capsule along with the antioxidants and the PUFA 
component. The amount of the antioxidants and PUFA component that should 
be incorporated into the pharmaceutical formulation should fit within the 
guidelines discussed above. 

As used in this application, the term "treat" refers to either preventing, or 
reducing the incidence of, the undesired occurrence. For example, to treat 
immune suppression refers to either preventing the occurrence of this 
suppression or reducing the amount of such suppression. The terms "patient" 
and "individual" are being used interchangeably and both refer to an animal. 
The term "animal" as used in this application refers to any warm-blooded 
mammal including, but not limited to, dogs, humans, monkeys, and apes. As 
used in the application the term "about" refers to an amount varying from the 
stated range or number by a reasonable amount depending upon the context of 
use. Any numerical number or range specified in the specification should be 
considered to be modified by the term "about". 

"Dose" and "serving" are used interchangeably and refer to the amount 
of the nutritional or pharmaceutical composition ingested by the patient in a 
single sitting and designed to deliver effective amounts of the antioxidants and 
the structured triglyceride. As will be readily apparent to those skilled in the 
art, a single dose or serving of the liquid nutritional powder should supply the 
amount of antioxidants and PUFAs discussed above. The amount of the dose or 
serving should be a volume that a typical adult can consume in one sitting. This 
amount can vary widely depending upon the age, weight, sex or medical 
condition of the patient. However, as a general guideline, a single serving or 
dose of a liquid nutritional product should be considered as encompassing a 
volume from 100 to 600 ml, more preferably from 125 to 500 ml and most 
preferably from 125 to 300 ml. 

The PUFAs of the present invention may also be added to food even 
when supplementation of the diet is not required. For example, the composition 
may be added to food of any type including but not limited to margarines, 
modified butters, cheeses, milk, yogurt, chocolate, candy, snacks, salad oils, 
cooking oils, cooking fats, meats, fish and beverages. 

Pharmaceutical Applications 

For pharmaceutical use (human or veterinary), the compositions are 
generally administered orally but can be administered by any route by which 
they may be successfully absorbed, e.g., parenterally (i.e. subcutaneously, 
intramuscularly or intravenously), rectally or vaginally or topically, for 
example, as a skin ointment or lotion. The PUFAs of the present invention may 
be administered alone or in combination with a pharmaceutically acceptable 
carrier or excipient. Where available, gelatin capsules are the preferred form of 
oral administration. Dietary supplementation as set forth above also can 
provide an oral route of administration. The unsaturated acids of the present 
invention may be administered in conjugated forms, or as salts, esters, amides 
or prodrugs of the fatty acids. Any pharmaceutically acceptable salt is 
encompassed by the present invention; especially preferred are the sodium, 
potassium or lithium salts. Also encompassed are the N-alkylpolyhydroxamine 
salts, such as N-methyl glucamine, found in PCT publication WO 96/33155. 
The preferred esters are the ethyl esters. As solid salts, the PUFAs also can be 
administered in tablet form. For intravenous administration, the PUFAs or 
derivatives thereof may be incorporated into commercial formulations such as 
Intralipids. The typical normal adult plasma fatty acid profile comprises 6.64 to 
9.46% of ARA, 1.45 to 3.11% of DGLA, and 0.02 to 0.08% of GLA. These 
PUFAs or their metabolic precursors can be administered, either alone or in 
mixtures with other PUFAs, to achieve a normal fatty acid profile in a patient. 
Where desired, the individual components of formulations may be individually 
provided in kit form, for single or multiple use. A typical dosage of a particular 
fatty acid is from 0.1 mg to 20 g, or even 100 g daily, and is preferably from 10 
mg to 1, 2, 5 or 10 g daily as required, or molar equivalent amounts of 
derivative forms thereof. Parenteral nutrition compositions comprising from 
about 2 to about 30 weight percent fatty acids calculated as triglycerides are 
encompassed by the present invention; preferred is a composition having from 
about 1 to about 25 weight percent of the total PUFA composition as GLA 
(USPN 5,196,198). Other vitamins, and particularly fat-soluble vitamins such 
as vitamin A, D, E and L-carnitine can optionally be included. Where desired, a 
preservative such as a tocopherol may be added, typically at about 0.1% by 
weight. 

Suitable pharmaceutical compositions may comprise physiologically 
acceptable sterile aqueous or non-aqueous solutions, dispersions, suspensions or 
emulsions and sterile powders for reconstitution into sterile injectable solutions 
or dispersions. Examples of suitable aqueous and non-aqueous carriers, 
diluents, solvents or vehicles include water, ethanol, polyols (propyleneglycol, 
polyethyleneglycol, glycerol, and the like), suitable mixtures thereof, vegetable 
oils (such as olive oil) and injectable organic esters such as ethyl oleate. Proper 
fluidity can be maintained, for example, by the maintenance of the required 
particle size in the case of dispersions and by the use of surfactants. It may also 
be desirable to include isotonic agents, for example sugars, sodium chloride and 
the like. Besides such inert diluents, the composition can also include 
adjuvants, such as wetting agents, emulsifying and suspending agents, 
sweetening, flavoring and perfuming agents. 

Suspensions, in addition to the active compounds, may contain 
suspending agents, as for example, ethoxylated isostearyl alcohols, 
polyoxyethylene sorbitol and sorbitan esters, microcrystalline cellulose, 
aluminum metahydroxide, bentonite, agar-agar and tragacanth, or mixtures of 
these substances and the like. 

An especially preferred pharmaceutical composition contains 
diacetyltartaric acid esters of mono- and diglycerides dissolved in an aqueous 
medium or solvent. Diacetyltartaric acid esters of mono- and diglycerides have 
an HLB value of about 9-12 and are significantly more hydrophilic than existing 
antimicrobial lipids that have HLB values of 2-4. Those existing hydrophobic 
lipids cannot be formulated into aqueous compositions. As disclosed herein, 
those lipids can now be solubilized into aqueous media in combination with 
diacetyltartaric acid esters of mono- and diglycerides. In accordance with this 
embodiment, diacetyltartaric acid esters of mono- and diglycerides (e.g., 
DATEM-C12:0) are melted with other active antimicrobial lipids (e.g., 18:2 and 
12:0 monoglycerides) and mixed to obtain a homogeneous mixture. 
Homogeneity allows for increased antimicrobial activity. The mixture can be 
completely dispersed in water. This is not possible without the addition of 
diacetyltartaric acid esters of mono- and diglycerides and premixing with other 
monoglycerides prior to introduction into water. The aqueous composition can 
then be admixed under sterile conditions with physiologically acceptable 
diluents, preservatives, buffers or propellants as may be required to form a spray 
or inhalant. 

The present invention also encompasses the treatment of numerous 
disorders with fatty acids. Supplementation with PUFAs of the present 
invention can be used to treat restenosis after angioplasty. Symptoms of 
inflammation, rheumatoid arthritis, and asthma and psoriasis can be treated with 
the PUFAs of the present invention. Evidence indicates that PUFAs may be 
involved in calcium metabolism, suggesting that PUFAs of the present 
invention may be used in the treatment or prevention of osteoporosis and of 
kidney or urinary tract stones. 

The PUFAs of the present invention can be used in the treatment of 
cancer. Malignant cells have been shown to have altered fatty acid 
compositions; addition of fatty acids has been shown to slow their growth and 
cause cell death, and to increase their susceptibility to chemotherapeutic agents. 
GLA has been shown to cause reexpression on cancer cells of the E-cadherin 
cellular adhesion molecules, loss of which is associated with aggressive 
metastasis. Clinical testing of intravenous administration of the water soluble 
lithium salt of GLA to pancreatic cancer patients produced statistically 
significant increases in their survival. PUFA supplementation may also be 
useful for treating cachexia associated with cancer. 

The PUFAs of the present invention can also be used to treat diabetes 
(USPN 4,826,877; Horrobin et al. Am. J. Clin. Nutr. Vol. 57 (Suppl.), 732S-737S). 
Altered fatty acid metabolism and composition has been demonstrated 
in diabetic animals. These alterations have been suggested to be involved in 
some of the long-term complications resulting from diabetes, including 
retinopathy, neuropathy, nephropathy and reproductive system damage. 
Primrose oil, which contains GLA, has been shown to prevent and reverse 
diabetic nerve damage. 

The PUFAs of the present invention can be used to treat eczema, reduce 
blood pressure and improve math scores. Essential fatty acid deficiency has 
been suggested as being involved in eczema, and studies have shown beneficial 
effects on eczema from treatment with GLA. GLA has also been shown to 
reduce increases in blood pressure associated with stress, and to improve 
performance on arithmetic tests. GLA and DGLA have been shown to inhibit 
platelet aggregation, cause vasodilation, lower cholesterol levels and inhibit 
proliferation of vessel wall smooth muscle and fibrous tissue (Brenner et al., 
Adv. Exp. Med. Biol. Vol. 83, p. 85-101, 1976). Administration of GLA or 
DGLA, alone or in combination with EPA, has been shown to reduce or prevent 
gastro-intestinal bleeding and other side effects caused by non-steroidal 
anti-inflammatory drugs (USPN 4,666,701). GLA and DGLA have also been shown 
to prevent or treat endometriosis and premenstrual syndrome (USPN 4,758,592) 
and to treat myalgic encephalomyelitis and chronic fatigue after viral infections 
(USPN 5,116,871). 

Further uses of the PUFAs of this invention include use in treatment of 
AIDS, multiple sclerosis, acute respiratory syndrome, hypertension and 
inflammatory skin disorders. The PUFAs of the invention also can be used for 
formulas for general health as well as for geriatric treatments. 

Veterinary Applications 

It should be noted that the above-described pharmaceutical and 
nutritional compositions may be utilized in connection with animals, as well as 
humans, as animals experience many of the same needs and conditions as 
humans. For example, the oil or acids of the present invention may be utilized in 
animal feed supplements. 

The following examples are presented by way of illustration, not of 
limitation. 

Examples 

Example 1 Construction of a cDNA Library from Mortierella alpina 

Example 2 Isolation of a Δ6-desaturase Nucleotide Sequence from 
Mortierella alpina 

Example 3 Identification of Δ6-desaturases Homologous to the 
Mortierella alpina Δ6-desaturase 

Example 4 Isolation of a Δ12-desaturase Nucleotide Sequence from 
Mortierella alpina 

Example 5 Expression of M. alpina Desaturase Clones in Baker's 
Yeast 

Example 6 Initial Optimization of Culture Conditions 

Example 7 Distribution of PUFAs in Yeast Lipid Fractions 

Example 8 Further Culture Optimization and Coexpression of Δ6 
and Δ12-desaturases 

Example 9 Identification of Homologues to M. alpina Δ5 and Δ6 
desaturases 

Example 10 Identification of M. alpina Δ5 and Δ6 homologues in 
other PUFA-producing organisms 

Example 11 Identification of M. alpina Δ5 and Δ6 homologues in 
other PUFA-producing organisms 

Example 12 Human Desaturase Gene Sequences 

Example 13 Nutritional Compositions 



Example 1 

Construction of a cDNA Library from Mortierella alpina 

Total RNA was isolated from a 3 day old PUFA-producing culture of 
Mortierella alpina using the protocol of Hoge et al. (1982) Experimental 
Mycology 6:225-232. The RNA was used to prepare double-stranded cDNA 
using BRL's lambda-ZipLox system following the manufacturer's instructions. 
Several size fractions of the M. alpina cDNA were packaged separately to yield 
libraries with different average-sized inserts. A "full-length" library contains 
approximately 3x10^ clones with an average insert size of 1.77 kb. The 
"sequencing-grade" library contains approximately 6x10^ clones with an 
average insert size of 1.1 kb. 

Example 2 

Isolation of a Δ6-desaturase Nucleotide Sequence from Mortierella alpina 

A nucleic acid sequence from a partial cDNA clone, Ma524, encoding a 
Δ6 fatty acid desaturase from Mortierella alpina was obtained by random 
sequencing of clones from the M. alpina cDNA sequencing grade library 
described in Example 1. cDNA-containing plasmids were excised as follows: 

Five µl of phage were combined with 100 µl of E. coli DH10B(ZIP) 
grown in ECLB plus 10 µg/ml kanamycin, 0.2% maltose, and 10 mM MgSO4 
and incubated at 37 degrees for 15 minutes. 0.9 ml SOC was added and 100 µl 
of the bacteria immediately plated on each of 10 ECLB + 50 µg Pen plates. No 
45 minute recovery time was needed. The plates were incubated overnight at 
37°. Colonies were picked into ECLB + 50 µg Pen media for overnight cultures 
to be used for making glycerol stocks and miniprep DNA. An aliquot of the 
culture used for the miniprep is stored as a glycerol stock. Plating on ECLB + 
50 µg Pen/ml resulted in more colonies and a greater proportion of colonies 
containing inserts than plating on 100 µg/ml Pen. 

Random colonies were picked and plasmid DNA purified using Qiagen 
miniprep kits. DNA sequence was obtained from the 5' end of the cDNA insert 
and compared to the National Center for Biotechnology Information (NCBI) 
nonredundant database using the BLASTX algorithm. Ma524 was identified as 
a putative desaturase based on DNA sequence homology to previously 
identified desaturases. 

A full-length cDNA clone was isolated from the M. alpina full-length 
library and designated pCGN5532. The cDNA is contained as a 1617 bp insert in 
the vector pZL1 (BRL) and, beginning with the first ATG, contains an open 
reading frame encoding 457 amino acids. The three "histidine 
boxes" known to be conserved among membrane-bound desaturases (Okuley, et 
al. (1994) The Plant Cell 6:147-158) were found to be present at amino acid 
positions 172-176, 209-213, and 395-399 (see Figure 3). As with other 
membrane-bound Δ6-desaturases the final HXXHH histidine box motif was 
found to be QXXHH. The amino acid sequence of Ma524 was found to display 
significant homology to a portion of a Caenorhabditis elegans cosmid, 
W06D2.4, a cytochrome b5/desaturase fusion protein from sunflower, and the 
Synechocystis and Spirulina Δ6-desaturases. In addition, Ma524 was shown to 
have homology to the borage Δ6-desaturase amino acid sequence (PCT publication 
WO 96/21022). Ma524 thus appears to encode a Δ6-desaturase that is related to 
the borage and algal Δ6-desaturases. The peptide sequences are shown as SEQ 
ID NO:5-SEQ ID NO:11. 
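
The HXXHH and QXXHH histidine box patterns mentioned above can be located in a protein sequence with a short regular expression. The Python sketch below is illustrative only: the function name is hypothetical, the toy sequence is invented for demonstration and is not the Ma524 sequence, and the pattern covers only the final-box variants named in the text (not every histidine box form).

```python
# Illustrative sketch only: the toy sequence is invented, not Ma524.
import re

# A lookahead is used so that overlapping matches are also reported.
# [HQ] allows either the canonical HXXHH or the QXXHH variant.
HIS_BOX = re.compile(r"(?=([HQ]..HH))")


def find_his_boxes(protein: str):
    """Return (1-based start position, matched motif) for each H/QXXHH hit."""
    return [(m.start() + 1, m.group(1)) for m in HIS_BOX.finditer(protein.upper())]


if __name__ == "__main__":
    toy = "MAAHDFHHKLQIEHHPG"  # hypothetical sequence for demonstration
    for start, motif in find_his_boxes(toy):
        print(start, motif)
```
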

The amino terminus of the encoded protein was found to exhibit 
significant homology to cytochrome b5 proteins. The Mortierella cDNA clone 
appears to represent a fusion between a cytochrome b5 and a fatty acid 
desaturase. Since cytochrome b5 is believed to function as the electron donor 
for membrane-bound desaturase enzymes, it is possible that the N-terminal 
cytochrome b5 domain of this desaturase protein is involved in its function. 
This may be advantageous when expressing the desaturase in heterologous 
systems for PUFA production. However, it should be noted that, although the 
amino acid sequences of Ma524 and the borage Δ6 were found to contain 
regions of homology, the base compositions of the cDNAs were shown to be 
significantly different. For example, the borage cDNA was shown to have an 
overall base composition of 60 % A/T, with some regions exceeding 70 %, 
while Ma524 was shown to have an average of 44 % A/T base composition, 
with no regions exceeding 60 %. This may have implications for expressing the 
cDNAs in microorganisms or animals which favor different base compositions. 
It is known that poor expression of recombinant genes can occur when the host 
prefers a base composition different from that of the introduced gene. 
Mechanisms for such poor expression include decreased stability, cryptic splice 
sites, and/or translatability of the mRNA and the like. 
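
A base composition comparison of the kind described above (overall percent A/T, and whether any region exceeds a threshold) can be sketched in Python as follows. The function names, the window size and the toy sequence are assumptions made for illustration; the source does not state what region length was used.

```python
# Illustrative sketch only: window size and toy sequence are assumptions.

def at_fraction(seq: str) -> float:
    """Fraction of bases in `seq` that are A or T."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("A") + seq.count("T")) / len(seq)


def max_window_at(seq: str, window: int) -> float:
    """Highest A/T fraction found in any contiguous window of the given length."""
    if window <= 0 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    return max(
        at_fraction(seq[i:i + window]) for i in range(len(seq) - window + 1)
    )


if __name__ == "__main__":
    toy = "ATATGCGCGC"  # hypothetical sequence for demonstration
    print(f"overall A/T: {at_fraction(toy):.0%}")
    print(f"richest 4-base window: {max_window_at(toy, 4):.0%}")
```
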

Example 3 

Identification of Δ6-desaturases Homologous to the 
Mortierella alpina Δ6-desaturase 

Nucleic acid sequences that encode putative Δ6-desaturases were 
identified through a BLASTX search of the Expressed Sequence Tag ("EST") 
databases through NCBI using the Ma524 amino acid sequence. Several 
sequences showed significant homology. In particular, the deduced amino acid 
sequences of two Arabidopsis thaliana sequences (accession numbers F13728 
and T42806) showed homology to two different regions of the deduced amino 
acid sequence of Ma524. The following PCR primers were designed: 
ATTS4723-FOR (complementary to F13728) SEQ ID NO:13 
5' CUACUACUACUAGGAGTCCTCTACGOTGTTTTG and 
T42806-REV (complementary to T42806) SEQ ID NO:14 
5' CAUCAUCAUCAUATGATGCTCAAGCTGAAACTG. Five µg of total 
RNA isolated from developing siliques of Arabidopsis thaliana was reverse 
transcribed using BRL Superscript RTase and the primer TSyn 
(5'-CCAAGCTTCTGCAGGAGCTCrrrrri 11 1 1 1 ill 1-3') and is shown as 
SEQ ID NO:12. PCR was carried out in a 50 µl volume containing: template 
derived from 25 ng total RNA, 2 pM each primer, 200 each 
deoxyribonucleotide triphosphate, 60 mM Tris-Cl, pH 8.5, 15 mM (NH4)2SO4, 
2 mM MgCl2, 0.2 U Taq Polymerase. Thermocycler conditions were as 
follows: 94 degrees for 30 sec, 50 degrees for 30 sec, 72 degrees for 30 sec. 
PCR was continued for 35 cycles followed by an additional extension at 72 
degrees for 7 minutes. PCR resulted in a fragment of approximately 750 base 
pairs which was subcloned, named 12-5, and sequenced. Each end of this 
fragment was found to correspond to the Arabidopsis ESTs from which the 
PCR primers were designed. The putative amino acid sequence of 12-5 was 
compared to that of Ma524, and ESTs from human (W28140), mouse 
(W53753), and C. elegans (R05219) (see Figure 4). Homology patterns with 
the Mortierella Δ6-desaturase indicate that these sequences represent putative 
desaturase polypeptides. Based on this experimental approach, it is likely that the 
full-length genes can be cloned using probes based on the EST sequences. 
Following the cloning, the genes can then be placed into expression vectors, 
expressed in host cells, and their specific Δ6- or other desaturase activity can be 
determined as described below. 
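
The thermocycler program given above (three 30 sec steps repeated for 35 cycles, followed by a 7 minute extension) can be written down as data and its total programmed hold time computed. The Python sketch below is illustrative only; the class and function names are hypothetical, and the total ignores temperature ramp times, which depend on the instrument.

```python
# Illustrative sketch only: names are hypothetical; ramp times are ignored.
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int    # hold temperature in degrees C
    seconds: int   # hold time in seconds


# Values taken from the thermocycler conditions stated in the text
CYCLE = (Step(94, 30), Step(50, 30), Step(72, 30))
N_CYCLES = 35
FINAL_EXTENSION = Step(72, 7 * 60)


def total_hold_seconds(cycle, n_cycles: int, final: Step) -> int:
    """Sum of programmed hold times: cycled steps plus the final extension."""
    return n_cycles * sum(step.seconds for step in cycle) + final.seconds


if __name__ == "__main__":
    total = total_hold_seconds(CYCLE, N_CYCLES, FINAL_EXTENSION)
    print(f"{total} s ({total / 60:.1f} min) of programmed hold time")
```
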

Example 4 

Isolation of a Δ12-desaturase Nucleotide Sequence from Mortierella alpina 

Based on the fatty acids it accumulates, it seemed probable that 
Mortierella alpina has an ω6 type desaturase. The ω6-desaturase is responsible 
for the production of linoleic acid (18:2) from oleic acid (18:1). Linoleic acid 
(18:2) is a substrate for a Δ6-desaturase. This experiment was designed to 
determine whether Mortierella alpina has a Δ12-desaturase polypeptide, and if so, to 
identify the corresponding nucleotide sequence. 

A random colony from the M. alpina sequencing grade library, Ma648, 
was sequenced and identified as a putative desaturase based on DNA sequence 
homology to previously identified desaturases, as described for Ma524 (see 
Example 2). The nucleotide sequence is shown in SEQ ID NO: 13. The peptide 
sequence is shown in SEQ ID NO:4. The deduced amino acid sequence from 
the 5' end of the Ma648 cDNA displays significant homology to soybean 
microsomal ω6 (Δ12) desaturase (accession #L43921) as well as castor bean 
oleate 12-hydroxylase (accession #U22378). In addition, homology was 
observed when compared to a variety of other ω6 (Δ12) and ω3 (Δ15) fatty acid 
desaturase sequences. 

Example 5 

Expression of M. alpina Desaturase Clones in Baker's Yeast 

Yeast Transformation 

Lithium acetate transformation of yeast was performed according to 
standard protocols (Methods in Enzymology, Vol. 194, p. 186-187, 1991). 
Briefly, yeast were grown in YPD at 30°C. Cells were spun down, resuspended 
in TE, spun down again, resuspended in TE containing 100 mM lithium acetate, 
spun down again, and resuspended in TE/lithium acetate. The resuspended 
yeast were incubated at 30°C for 60 minutes with shaking. Carrier DNA was 
added, and the yeast were aliquoted into tubes. Transforming DNA was added, 
and the tubes were incubated for 30 min. at 30°C. PEG solution (35% (w/v) 
PEG 4000, 100 mM lithium acetate, TE pH 7.5) was added followed by a 50 
min. incubation at 30°C. A 5 min. heat shock at 42°C was performed, the cells 
were pelleted, washed with TE, pelleted again and resuspended in TE. The 
resuspended cells were then plated on selective media. 

Desaturase Expression in Transformed Yeast

cDNA clones from Mortierella alpina were screened for desaturase
activity in baker's yeast. A canola Δ15-desaturase (obtained by PCR using 1st
strand cDNA from Brassica napus cultivar 212/86 seeds using primers based on
the published sequence (Arondel et al. Science 258:1353-1355)) was used as a
positive control. The Δ15-desaturase gene and the gene from cDNA clones
Ma524 and Ma648 were put in the expression vector pYES2 (Invitrogen),
resulting in plasmids pCGR-2, pCGR-5 and pCGR-7, respectively. These
plasmids were transfected into S. cerevisiae yeast strain 334 and expressed after
induction with galactose and in the presence of substrates that allowed detection
of specific desaturase activity. The control strain was S. cerevisiae strain 334
containing the unaltered pYES2 vector. The substrates used, the products
produced and the indicated desaturase activity were: DGLA (conversion to
ARA would indicate Δ5-desaturase activity), linoleic acid (conversion to GLA
would indicate Δ6-desaturase activity; conversion to ALA would indicate
Δ15-desaturase activity), oleic acid (an endogenous substrate made by
S. cerevisiae, conversion to linoleic acid would indicate Δ12-desaturase
activity, which S. cerevisiae lacks), or ARA (conversion to EPA would indicate
Δ17-desaturase activity).

Cultures were grown for 48-52 hours at 15°C in the presence of a
particular substrate. Lipid fractions were extracted for analysis as follows:
Cells were pelleted by centrifugation, washed once with sterile ddH2O, and
repelleted. Pellets were vortexed with methanol; chloroform was added along
with tritridecanoin (as an internal standard). The mixtures were incubated for at
least one hour at room temperature or at 4°C overnight. The chloroform layer
was extracted and filtered through a Whatman filter with one gram of anhydrous
sodium sulfate to remove particulates and residual water. The organic solvents
were evaporated at 40°C under a stream of nitrogen. The extracted lipids were
then derivatized to fatty acid methyl esters (FAME) for gas chromatography
analysis (GC) by adding 2 ml of 0.5 N potassium hydroxide in methanol to a
closed tube. The samples were heated to 95°C to 100°C for 30 minutes and
cooled to room temperature. Approximately 2 ml of 14% boron trifluoride in
methanol was added and the heating repeated. After the extracted lipid mixture
cooled, 2 ml of water and 1 ml of hexane were added to extract the FAME for
analysis by GC. The percent conversion was calculated by dividing the product
produced by the sum of (the product produced and the substrate added) and then
multiplying by 100. To calculate the oleic acid percent conversion, as no
substrate was added, the total linoleic acid produced was divided by the sum of
oleic acid and linoleic acid produced, then multiplied by 100. The desaturase
activity results are provided in Table 1 below.
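
The percent conversion arithmetic described above can be sketched as follows.
This is a minimal illustration, assuming that substrate and product are
expressed in the same units (for example, percent of total lipid); the function
names are hypothetical and not part of the original protocol. The sample values
0.9 and 0.7 are the 1 µM pCGR-5 entry of Table 3A, and 6.6 and 15.8 are the
pCGR-7 entry without glucose.

```python
def percent_conversion(product, substrate):
    """Product divided by (product + substrate), multiplied by 100."""
    total = product + substrate
    if total == 0:
        raise ValueError("no substrate or product detected")
    return 100.0 * product / total


def oleic_percent_conversion(linoleic, oleic):
    """Endogenous case: no substrate is added, so the oleic acid
    measured in the extract takes the place of the added substrate."""
    return percent_conversion(product=linoleic, substrate=oleic)


# Exogenous substrate: 18:2 incorporated = 0.9, gamma-18:3 formed = 0.7
delta6 = percent_conversion(product=0.7, substrate=0.9)

# Endogenous substrate: 18:1 = 6.6, 18:2 formed = 15.8
delta12 = oleic_percent_conversion(linoleic=15.8, oleic=6.6)

print(f"delta-6 conversion:  {delta6:.1f}%")
print(f"delta-12 conversion: {delta12:.1f}%")
```

Both cases reduce to the same ratio; only the source of the substrate differs.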
Table 1

M. alpina Desaturase Expression in Baker's Yeast

CLONE                 ENZYME ACTIVITY   % CONVERSION OF SUBSTRATE

pCGR-2                Δ6                0 (18:2 to 18:3ω6)
(canola Δ15           Δ15               16.3 (18:2 to 18:3ω3)
desaturase)           Δ5                2.0 (20:3 to 20:4ω6)
                      Δ17               2.8 (20:4 to 20:5ω3)
                      Δ12               1.8 (18:1 to 18:2ω6)

pCGR-5                Δ6                6.0
(M. alpina            Δ15               0
Ma524)                Δ5                2.1
                      Δ17               0
                      Δ12               3.3

pCGR-7                Δ6                0
(M. alpina            Δ15               3.8
Ma648)                Δ5                2.2
                      Δ17               0
                      Δ12               63.4

The Δ15-desaturase control clone exhibited 16.3% conversion of the
substrate. The pCGR-5 clone expressing the Ma524 cDNA showed 6%
conversion of the substrate to GLA, indicating that the gene encodes a Δ6-
desaturase. The pCGR-7 clone expressing the Ma648 cDNA converted 63.4%
of the substrate to LA, indicating that the gene encodes a Δ12-
desaturase. The background (non-specific conversion of substrate) was between
0-3% in these cases. We also found substrate inhibition of the activity by using
different concentrations of the substrate. When substrate was added to 100 µM,
the percent conversion to product dropped compared to when substrate was added
to 25 µM (see below). Additionally, by varying the substrate concentration
between 5 µM and 200 µM, conversion ratios were found to range between about
5% to about 75% greater. These data show that desaturases with different
substrate specificities can be expressed in a heterologous system and used to
produce poly-unsaturated long chain fatty acids.

Table 2 represents fatty acids of interest as a percent of the total lipid
extracted from the yeast host S. cerevisiae 334 with the indicated plasmid. No
glucose was present in the growth media. Affinity gas chromatography was used
to separate the respective lipids. GC/MS was employed to verify the identity of
the product(s). The expected product for the B. napus Δ15-desaturase, α-
linolenic acid, was detected when its substrate, linoleic acid, was added
exogenously to the induced yeast culture. This finding demonstrates that yeast
expression of a desaturase gene can produce functional enzyme and detectable
amounts of product under the current growth conditions. Both exogenously
added substrates were taken up by yeast, although slightly less of the longer chain
PUFA, dihomo-γ-linolenic acid (20:3), was incorporated into yeast than linoleic
acid (18:2) when either was added in free form to the induced yeast cultures.
γ-linolenic acid was detected when linoleic acid was present during induction and
expression of S. cerevisiae 334 (pCGR-5). The presence of this PUFA
demonstrates Δ6-desaturase activity from pCGR-5 (MA524). Linoleic acid,
identified in the extracted lipids from expression of S. cerevisiae 334 (pCGR-7),
classifies the cDNA MA648 from M. alpina as the Δ12-desaturase.
Example 6

Optimization of Culture Conditions

Table 3A shows the effect of exogenous free fatty acid substrate
concentration on yeast uptake and conversion to fatty acid product as a
percentage of the total yeast lipid extracted. In all instances, low amounts of
exogenous substrate (1-10 µM) resulted in low fatty acid substrate uptake and
product formation. Between 25 and 50 µM concentration of free fatty acid in
the growth and induction media gave the highest percentage of fatty acid
product formed, while the 100 µM concentration and subsequent high uptake
into yeast appeared to decrease or inhibit the desaturase activity. The amount of
fatty acid substrate for yeast expressing Δ12-desaturase was similar under the
same growth conditions, since the substrate, oleic acid, is an endogenous yeast
fatty acid. The use of α-linolenic acid as an additional substrate for pCGR-5
(Δ6) produced the expected product, stearidonic acid (Table 3A). The feedback
inhibition of high fatty acid substrate concentration was well illustrated when
the percent conversion rates of the respective fatty acid substrates to their
respective products were compared in Table 3B. In all cases, 100 µM substrate
concentration in the growth media decreased the percent conversion to product.
The uptake of α-linolenic was comparable to other PUFAs added in free form,
while the Δ6-desaturase percent conversion, 3.8-17.5%, to the product
stearidonic acid was the lowest of all the substrates examined (Table 3B). The
effect of media, such as YPD (rich media) versus minimal media with glucose,
on the conversion rate of Δ12-desaturase was dramatic. Not only did the
conversion rate for oleic to linoleic acid drop (Table 3B), but the percent of
linoleic acid formed also decreased by 11% when rich media was used for
growth and induction of yeast desaturase Δ12 expression (Table 3A). The
effect of media composition was also evident when glucose was present in the
growth media for Δ6-desaturase, since the percent of substrate uptake was
decreased at 25 µM (Table 3A). However, the conversion rate remained the
same and percent product formed decreased for Δ6-desaturase in the
presence of glucose.

WO 98/46763    PCT/US98/07126

Table 3A

Effect of Added Substrate on the Percentage of Incorporated
Substrate and Product Formed in Yeast Extracts

Plasmid in Yeast     pCGR-2        pCGR-5        pCGR-5        pCGR-7
                     (Δ15)         (Δ6)          (Δ6)          (Δ12)
Substrate/product    18:2/α-18:3   18:2/γ-18:3   α-18:3/18:4   18:1*/18:2

1 µM sub.            ND            0.9/0.7       ND            ND
10 µM sub.           ND            4.2/2.4       10.4/2.2      ND
25 µM sub.           ND            11/3.7        18.2/2.7      ND
25 µM◊ sub.          36.6/7.2◊     25.1/10.3◊    ND            6.6/15.8◊
50 µM sub.           53.1/6.5◊     ND            36.2/3        10.8/13+
100 µM sub.          60.1/5.7◊     62.4/4◊       47.7/1.9      10/24.8

Table 3B

Effect of Substrate Concentration in Media on the Percent Conversion
of Fatty Acid Substrate to Product in Yeast Extracts

Plasmid in Yeast     pCGR-2        pCGR-5        pCGR-5        pCGR-7
                     (Δ15)         (Δ6)          (Δ6)          (Δ12)
substrate->product   18:2->α-18:3  18:2->γ-18:3  α-18:3->18:4  18:1*->18:2

1 µM sub.            ND            43.8          ND            ND
10 µM sub.           ND            36.4          17.5          ND
25 µM sub.           ND            25.2          12.9          ND
25 µM◊ sub.          16.4◊         29.1◊         ND            70.5◊
50 µM sub.           10.9◊         ND            7.7           54.6+
100 µM sub.          8.7◊          6◊            3.8           71.3

◊ no glucose in media
+ Yeast peptone broth (YPD)
* 18:1 is an endogenous yeast lipid
sub. is substrate concentration
ND (not done)
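
The relationship between Table 3A and Table 3B can be checked with a short
sketch. It assumes that each Table 3B entry is the Table 3A product divided by
the sum of substrate and product, times 100, and uses the pCGR-2 (Δ15) column
as the example; the variable names are illustrative only.

```python
# (substrate concentration, % 18:2 incorporated, % alpha-18:3 formed),
# taken from the pCGR-2 column of Table 3A
PCGR2_TABLE_3A = [
    ("25 uM", 36.6, 7.2),
    ("50 uM", 53.1, 6.5),
    ("100 uM", 60.1, 5.7),
]


def percent_conversion(substrate, product):
    """Product as a percentage of substrate plus product."""
    return 100.0 * product / (substrate + product)


conversions = [
    (label, round(percent_conversion(substrate, product), 1))
    for label, substrate, product in PCGR2_TABLE_3A
]

for label, value in conversions:
    print(f"{label}: {value}% conversion")
```

Under that assumption the computed values reproduce the pCGR-2 column of
Table 3B: substrate uptake rises with concentration while percent conversion
falls, which is the inhibition pattern the text describes.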



Table 4 shows the amount of fatty acid produced by a recombinant
desaturase from induced yeast cultures when different amounts of free fatty acid
substrate were used. Fatty acid weight was determined since the total amount of
lipid varied dramatically when the growth conditions were changed, such as the
presence of glucose in the yeast growth and induction media. To better
determine the conditions when the recombinant desaturase would produce the
most PUFA product, the quantities of individual fatty acids were examined. The
absence of glucose dramatically reduced by three fold the amount of linoleic
acid produced by recombinant Δ12-desaturase. For the Δ12-desaturase the
amount of total yeast lipid was decreased by almost half in the absence of
glucose. Conversely, the presence of glucose in the yeast growth media for Δ6-
desaturase drops the γ-linolenic acid produced by almost half, while the total
amount of yeast lipid produced was not changed by the presence/absence of
glucose. This points to a possible role for glucose as a modulator of Δ6-
desaturase activity.



Table 4

Fatty Acid Produced in µg from Yeast Extracts

Plasmid in Yeast     pCGR-5        pCGR-5        pCGR-7
(enzyme)             (Δ6)          (Δ6)          (Δ12)
product              γ-18:3        18:4          18:2*

1 µM sub.            1.9           ND            ND
10 µM sub.           5.3           4.4           ND
25 µM sub.           10.3          8.7           115.7
25 µM◊ sub.          29.6          ND            39◊

◊ no glucose in media
sub. is substrate concentration
ND (not done)
* 18:1, the substrate, is an endogenous yeast lipid

Example 7

Distribution of PUFAs in Yeast Lipid Fractions

Table 5 illustrates the uptake of free fatty acids and their new products
formed in yeast lipids as distributed in the major lipid fractions. A total lipid
extract was prepared as described above. The lipid extract was separated on
TLC plates, and the fractions were identified by comparison to standards. The
bands were collected by scraping, and internal standards were added. The
fractions were then saponified and methylated as above, and subjected to gas
chromatography. The gas chromatograph calculated the amount of fatty acid by
comparison to a standard. The phospholipid fraction contained the highest
amount of substrate and product PUFAs for Δ6-desaturase activity. It would
appear that the substrates are accessible in the phospholipid form to the
desaturases.






Table 5

Fatty Acid Distribution in Various Yeast Lipid Fractions in µg

Fatty acid fraction          Phospholipid  Diglyceride  Free Fatty  Triglyceride  Cholesterol
                                                        Acid                      Ester

SC (pCGR-5) substrate 18:2   166.6         6.2          15          18.2          15.6
SC (pCGR-5) product γ-18:3   61.7          1.6          42          5.9           1.2

SC = S. cerevisiae (plasmid)



Example 8

Further Culture Optimization and Coexpression of Δ6 and Δ12-desaturases

This experiment was designed to evaluate the growth and induction
conditions for optimal activities of desaturases in Saccharomyces cerevisiae. A
Saccharomyces cerevisiae strain (SC334) capable of producing γ-linolenic acid
(GLA) was developed, to assess the feasibility of production of PUFA in yeast.
The genes for Δ6 and Δ12-desaturases from M. alpina were coexpressed in
SC334. Expression of Δ12-desaturase converted oleic acid (present in yeast) to
linoleic acid. The linoleic acid was used as a substrate by the Δ6-desaturase to
produce GLA. The quantity of GLA produced ranged between 5-8% of the
total fatty acids produced in SC334 cultures and the conversion rate of linoleic
acid to γ-linolenic acid ranged between 30% to 50%. The induction temperature
was optimized, and the effect of changing host strain and upstream promoter
sequences on expression of Δ6 and Δ12 (MA 524 and MA 648 respectively)
desaturase genes was also determined.
Plasmid Construction

The cloning of pCGR5 as well as pCGR7 has been discussed above. To
construct pCGR9a and pCGR9b, the Δ6 and Δ12-desaturase genes were
amplified using the following sets of primers. The primers pRDS1 and pRDS3
had an XhoI site and primers pRDS2 and pRDS4 had an XbaI site (indicated in
bold). These primer sequences are presented as SEQ ID NO: 15-18.

I. Δ6-desaturase amplification primers

a. pRDS1 TAG CAA CTC GAG AAA ATG GCT GCT GCT CCC
         AGT GTG AGG

b. pRDS2 AAC TGA TCT AGA TTA CTG CGC CTT ACC CAT
         CTT GGA GGC

II. Δ12-desaturase amplification primers

a. pRDS3 TAG CAA CTC GAG AAA ATG GCA CCT CCC
         AAC ACT ATG GAT

b. pRDS4 AAC TGA TCT AGA TTA CTT CTT GAA AAA GAC
         CAC GTC TCC
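
The restriction sites carried by the four primers can be located with a short
sketch. The recognition sequences CTCGAG (XhoI) and TCTAGA (XbaI) are the
standard ones for these enzymes; the dictionary and function names are
illustrative assumptions and not part of the original method.

```python
# Primer sequences as listed above (spaces are cosmetic)
PRIMERS = {
    "pRDS1": "TAG CAA CTC GAG AAA ATG GCT GCT GCT CCC AGT GTG AGG",
    "pRDS2": "AAC TGA TCT AGA TTA CTG CGC CTT ACC CAT CTT GGA GGC",
    "pRDS3": "TAG CAA CTC GAG AAA ATG GCA CCT CCC AAC ACT ATG GAT",
    "pRDS4": "AAC TGA TCT AGA TTA CTT CTT GAA AAA GAC CAC GTC TCC",
}

# Standard recognition sequences for the two enzymes
SITES = {"XhoI": "CTCGAG", "XbaI": "TCTAGA"}


def find_sites(primer):
    """Return {enzyme: 0-based position} for each site present in the primer."""
    seq = primer.replace(" ", "").upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


for name, primer in PRIMERS.items():
    print(name, find_sites(primer))
```

Each forward primer (pRDS1, pRDS3) carries only the XhoI site and each reverse
primer (pRDS2, pRDS4) only the XbaI site, which is what allows the amplified
genes to be cloned in a defined orientation.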

The pCGR5 and pCGR7 constructs were used as template DNA for
amplification of Δ6 and Δ12-desaturase genes, respectively. The amplified
products were digested with XbaI and XhoI to create "sticky ends". The PCR
amplified Δ6-desaturase with XhoI-XbaI ends was cloned into pCGR7, which was
also cut with XhoI-XbaI. This procedure placed the Δ6-desaturase behind the
Δ12-desaturase, under the control of an inducible promoter GAL1. This
construct was designated pCGR9a. Similarly, to construct pCGR9b, the Δ12-
desaturase with XhoI-XbaI ends was cloned in the XhoI-XbaI sites of pCGR5.
In pCGR9b the Δ12-desaturase was behind the Δ6-desaturase gene, away from
the GAL promoter.

To construct pCGR10, the vector pRS425, which contains the
constitutive Glyceraldehyde 3-Phosphate Dehydrogenase (GPD) promoter, was
digested with BamHI and pCGR5 was digested with BamHI-XhoI to release the
Δ6-desaturase gene. This Δ6-desaturase fragment and BamHI cut pRS425 were
filled using Klenow Polymerase to create blunt ends and ligated, resulting in
pCGR10a and pCGR10b containing the Δ6-desaturase gene in the sense and
antisense orientation, respectively. To construct pCGR11 and pCGR12, the Δ6
and Δ12-desaturase genes were isolated from pCGR5 and pCGR7, respectively,
using an EcoRI-XhoI double digest. The EcoRI-XhoI fragments of Δ6 and Δ12-
desaturases were cloned into the pYX242 vector digested with EcoRI-XhoI.
The pYX242 vector has the promoter of TPI (a yeast housekeeping gene),
which allows constitutive expression.

Yeast Transformation and Expression

Different combinations of pCGR5, pCGR7, pCGR9a, pCGR9b,
pCGR10a, pCGR11 and pCGR12 were introduced into various host strains of
Saccharomyces cerevisiae. Transformation was done using PEG/LiAc protocol
(Methods in Enzymology Vol. 194 (1991): 186-187). Transformants were
selected by plating on synthetic media lacking the appropriate amino acid. The
pCGR5, pCGR7, pCGR9a and pCGR9b can be selected on media lacking
uracil. The pCGR10, pCGR11 and pCGR12 constructs can be selected on
media lacking leucine. Growth of cultures and fatty acid analysis was
performed as in Example 5 above.

Production of GLA

Production of GLA requires the expression of two enzymes (the Δ6 and
Δ12-desaturases), which are absent in yeast. To express these enzymes at
optimum levels the following constructs or combinations of constructs were
introduced into various host strains:

1) pCGR9a/SC334

2) pCGR9b/SC334

3) pCGR10a and pCGR7/SC334

4) pCGR11 and pCGR7/SC334

5) pCGR12 and pCGR5/SC334

6) pCGR10a and pCGR7/DBY746

7) pCGR10a and pCGR7/DBY746

The pCGR9a construct has both the Δ6 and Δ12-desaturase genes under
the control of an inducible GAL promoter. The SC334 host cells transformed
with this construct did not show any GLA accumulation in total fatty acids (Fig.
6A and B, lane 1). However, when the Δ6 and Δ12-desaturase genes were
individually controlled by the GAL promoter, the control constructs were able
to express Δ6- and Δ12-desaturase, as evidenced by the conversion of their
respective substrates to products. The Δ12-desaturase gene in pCGR9a was
expressed as evidenced by the conversion of 18:1ω9 to 18:2ω6 in
pCGR9a/SC334, while the Δ6-desaturase gene was not expressed/active,
because the 18:2ω6 was not being converted to 18:3ω6 (Fig. 6A and B, lane 1).

The pCGR9b construct also had both the Δ6 and Δ12-desaturase genes
under the control of the GAL promoter but in an inverse order compared to
pCGR9a. In this case, very little GLA (<1%) was seen in pCGR9b/SC334
cultures. The expression of Δ12-desaturase was also very low, as evidenced by
the low percentage of 18:2ω6 in the total fatty acids (Fig. 6A and B, lane 1).

To test if expressing both enzymes under the control of independent
promoters would increase GLA production, the Δ6-desaturase gene was cloned
into the pRS425 vector. The construct of pCGR10a has the Δ6-desaturase in the
correct orientation, under control of constitutive GPD promoter. The pCGR10b
has the Δ6-desaturase gene in the inverse orientation, and serves as the negative
control. The pCGR10a/SC334 cells produced significantly higher levels of
GLA (5% of the total fatty acids, Fig. 6, lane 3), compared to pCGR9a. Both
the Δ6 and Δ12-desaturase genes were expressed at high level because the
conversion of 18:1ω9 -> 18:2ω6 was 65%, while the conversion of 18:2ω6 ->
18:3ω6 (Δ6-desaturase) was 30% (Fig. 6, lane 3). As expected, the negative
control pCGR10b/SC334 did not show any GLA.

To further optimize GLA production, the Δ6 and Δ12 genes were
introduced into the pYX242 vector, creating pCGR11 and pCGR12
respectively. The pYX242 vector allows for constitutive expression by the TPI
promoter (Alber, T. and Kawasaki, G. (1982). J. Mol. & Appl. Genetics 1:
419). The introduction of pCGR11 and pCGR7 in SC334 resulted in
approximately 8% of GLA in total fatty acids of SC334. The rate of conversion
of 18:1ω9 -> 18:2ω6 and 18:2ω6 -> 18:3ω6 was approximately 50% and 44%
respectively (Fig. 6A and B, lane 4). The presence of pCGR12 and pCGR5 in
SC334 resulted in 6.6% GLA in total fatty acids with a conversion rate of
approximately 50% for both 18:1ω9 to 18:2ω6 and 18:2ω6 to 18:3ω6,
respectively (Fig. 6A and B, lane 5). Thus although the quantity of GLA in
total fatty acids was higher in the pCGR11/pCGR7 combination of constructs,
the conversion rates of substrate to product were better for the pCGR12/pCGR5
combination.

To determine if changing host strain would increase GLA production,
pCGR10a and pCGR7 were introduced into the host strains BJ1995 and
DBY746 (obtained from the Yeast Genetic Stock Centre, 1021 Donner
Laboratory, Berkeley, CA 94720). The genotype of strain DBY746 is Mata,
his3-Δ1, leu2-3, leu2-112, ura3-32, trp1-289, gal. The results are shown in Fig.
7. Changing host strain to BJ1995 did not improve the GLA production,
because the quantity of GLA was only 1.31% of total fatty acids and the
conversion rate of 18:1ω9 -> 18:2ω6 was approximately 17% in BJ1995. No
GLA was observed in DBY746 and the conversion of 18:1ω9 -> 18:2ω6 was
very low (<1% in control), suggesting that a cofactor required for the expression
of Δ12-desaturase might be missing in DBY746 (Fig. 7, lane 2).

To determine the effect of temperature on GLA production, SC334
cultures containing pCGR10a and pCGR7 were grown at 15°C and 30°C.
Higher levels of GLA were found in cultures grown and induced at 15°C than
those in cultures grown at 30°C (4.23% vs. 1.68%). This was due to a lower
conversion rate of 18:2ω6 -> 18:3ω6 at 30°C (11.6% vs. 29% in 15°C cultures),
despite a higher conversion of 18:1ω9 -> 18:2ω6 (65% vs. 60% at 30°C) (Fig.
8). These results suggest that Δ12- and Δ6-desaturases may have different
optimal expression temperatures.
Of the various parameters examined in this study, temperature of
growth, yeast host strain and media components had the most significant impact
on the expression of desaturase, while timing of substrate addition and
concentration of inducer did not significantly affect desaturase expression.

These data show that two DNAs encoding desaturases that can convert
LA to GLA or oleic acid to LA can be isolated from Mortierella alpina and can
be expressed, either individually or in combination, in a heterologous system
and used to produce poly-unsaturated long chain fatty acids. Exemplified is the
production of GLA from oleic acid by expression of Δ12- and Δ6-desaturases in
yeast.

Example 9 

Identification of Homologues to M. alpina Δ5 and Δ6 desaturases

A nucleic acid sequence that encodes a putative Δ5 desaturase was
identified through a TBLASTN search of the expressed sequence tag databases
through NCBI using amino acids 100-446 of Ma29 as a query. The truncated
portion of the Ma29 sequence was used to avoid picking up homologies based
on the cytochrome b5 portion at the N-terminus of the desaturase. The deduced
amino acid sequence of an EST from Dictyostelium discoideum (accession #
C25549) shows very significant homology to Ma29 and lesser, but still
significant homology to Ma524. The DNA sequence is presented as SEQ ID
NO: 19. The amino acid sequence is presented as SEQ ID NO:20.

Example 10 

Identification of M. alpina Δ5 and Δ6 homologues in other
PUFA-producing organisms

To look for desaturases involved in PUFA production, a cDNA library
was constructed from total RNA isolated from Phaeodactylum tricornutum. A
plasmid-based cDNA library was constructed in pSPORT1 (GIBCO-BRL)
following manufacturer's instructions using a commercially available kit
(GIBCO-BRL). Random cDNA clones were sequenced and nucleic acid
sequences that encode putative Δ5 or Δ6 desaturases were identified through
BLAST search of the databases and comparison to Ma29 and Ma524 sequences.

One clone was identified from the Phaeodactylum library with
homology to Ma29 and Ma524; it is called 144-011-B12. The DNA sequence is
presented as SEQ ID NO:21. The amino acid sequence is presented as SEQ ID
NO:22.

Example 11 

Identification of M. alpina Δ5 and Δ6 homologues in other
PUFA-producing organisms

To look for desaturases involved in PUFA production, a cDNA library
was constructed from total RNA isolated from Schizochytrium species. A
plasmid-based cDNA library was constructed in pSPORT1 (GIBCO-BRL)
following manufacturer's instructions using a commercially available kit
(GIBCO-BRL). Random cDNA clones were sequenced and nucleic acid
sequences that encode putative Δ5 or Δ6 desaturases were identified through
BLAST search of the databases and comparison to Ma29 and Ma524 sequences.

One clone was identified from the Schizochytrium library with
homology to Ma29 and Ma524; it is called 81-23-C7. This clone contains a ~1
kb insert. Partial sequence was obtained from each end of the clone using the
universal forward and reverse sequencing primers. The DNA sequence from
the forward primer is presented as SEQ ID NO:23. The peptide sequence is
presented as SEQ ID NO:24. The DNA sequence from the reverse primer is
presented as SEQ ID NO:25. The amino acid sequence from the reverse primer
is presented as SEQ ID NO:26.
Example 12

Human Desaturase Gene Sequences

Human desaturase gene sequences potentially involved in long chain
polyunsaturated fatty acid biosynthesis were isolated based on homology
between the human cDNA sequences and Mortierella alpina desaturase gene
sequences. The three conserved "histidine boxes" known to be conserved
among membrane-bound desaturases were found. As with some other
membrane-bound desaturases the final HXXHH histidine box motif was found
to be QXXHH. The amino acid sequence of the putative human desaturases
exhibited homology to M. alpina Δ5, Δ6, Δ9, and Δ12 desaturases.

The M. alpina Δ5 desaturase and Δ6 desaturase cDNA sequences were
used to search the LifeSeq database of Incyte Pharmaceuticals, Inc., Palo Alto,
California 94304. The Δ5 desaturase sequence was divided into fragments: 1)
amino acid no. 1-150, 2) amino acid no. 151-300, and 3) amino acid no. 301-
446. The Δ6 desaturase sequence was divided into three fragments: 1) amino
acid no. 1-150, 2) amino acid no. 151-300, and 3) amino acid no. 301-457.
These polypeptide fragments were searched against the database using the
"tblastn" algorithm. This algorithm compares a protein query sequence against
a nucleotide sequence database dynamically translated in all six reading frames
(both strands).
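
The two ideas in the preceding paragraph, splitting a protein query into
fragments and translating nucleotide sequences in all six reading frames, can
be illustrated with a minimal sketch. It is not the tblastn implementation;
it assumes the standard genetic code, and all function names are hypothetical.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in T, C, A, G order
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna):
    """Complement each base, then reverse the strand."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons only; trailing bases are ignored."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def six_frame_translations(dna):
    """Frames +1, +2, +3 on the given strand, then -1, -2, -3 on the other."""
    dna = dna.upper()
    strands = (dna, reverse_complement(dna))
    return [translate(strand[offset:]) for strand in strands for offset in range(3)]


def fragment_ranges(length, size=150, count=3):
    """1-based inclusive ranges; the last fragment takes the remainder."""
    ranges = [(i * size + 1, (i + 1) * size) for i in range(count - 1)]
    ranges.append(((count - 1) * size + 1, length))
    return ranges


print(fragment_ranges(446))   # fragments used for the 446-residue query
print(fragment_ranges(457))   # fragments used for the 457-residue query
print(six_frame_translations("ATGGCTGCTGCTCCC"))
```

A real search additionally scores each translated frame against the query
fragment; the sketch only shows how the six candidate protein sequences arise.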

The polypeptide fragments 2 and 3 of M. alpina Δ5 and Δ6 have
homologies with the CloneID sequences as outlined in Table 6. The CloneID
represents an individual sequence from the Incyte LifeSeq database. After the
"tblastn" results had been reviewed, Clone Information was searched with the
default settings of Stringency of >=50, and Productscore <=100 for different
CloneID numbers. The Clone Information Results displayed the information
including the ClusterID, CloneID, Library, HitID, Hit Description. When
selected, the ClusterID number displayed the clone information of all the clones
that belong in that ClusterID. The Assemble command assembles all of the
CloneIDs which comprise the ClusterID. The following default settings were
used for GCG (Genetics Computer Group, University of Wisconsin
Biotechnology Center, Madison, Wisconsin 53705) Assembly:

Word Size: 7

Minimum Overlap: 14

Stringency: 0.8

Minimum Identity: 14

Maximum Gap: 10

Gap Weight: 8

Length Weight: 2

GCG Assembly Results displayed the contigs generated on the basis of
sequence information within the CloneID. A contig is an alignment of DNA
sequences based on areas of homology among these sequences. A new
sequence (consensus sequence) was generated based on the aligned DNA
sequences within a contig. The contig containing the CloneID was identified,
and the ambiguous sites of the consensus sequence were edited based on the
alignment of the CloneIDs (see SEQ ID NO:27 - SEQ ID NO:32) to generate
the best possible sequence. The procedure was repeated for all six CloneIDs
listed in Table 6. This produced five unique contigs. The edited consensus
sequences of the 5 contigs were imported into the Sequencher software program
(Gene Codes Corporation, Ann Arbor, Michigan 48105). These consensus
sequences were assembled. The contig 2511785 overlaps with contig 3506132,
and this new contig was called 2535 (SEQ ID NO:33). The contigs from the
Sequencher program were copied into the Sequence Analysis software package
of GCG.

Each contig was translated in all six reading frames into protein
sequences. The M. alpina Δ5 (MA29) and Δ6 (MA524) sequences were
compared with each of the translated contigs using the FastA search (a Pearson
and Lipman search for similarity between a query sequence and a group of
sequences of the same type (nucleic acid or protein)). Homology among these
sequences suggests the open reading frames of each contig. The homology
among the M. alpina Δ5 and Δ6 to contigs 2535 and 3854933 was utilized to
create the final contig called 253538a. Figure 13 is the FastA match of the final
contig 253538a and MA29, and Figure 14 is the FastA match of the final contig
253538a and MA524. The DNA sequences for the various contigs are
presented in SEQ ID NO:27 - SEQ ID NO:33. The various peptide sequences
are shown in SEQ ID NO:34 - SEQ ID NO:40.

Although the open reading frame was generated by merging the two
contigs, the contig 2535 shows that there is a unique sequence in the beginning
of this contig which does not match with the contig 3854933. Therefore, it is
possible that these contigs were generated from independent desaturase-like
human genes.

The contig 253538a contains an open reading frame encoding 432
amino acids. It starts with Gln (CAG) and ends with the stop codon (TGA).
The contig 253538a aligns with both M. alpina Δ5 and Δ6 sequences,
suggesting that it could be either of the desaturases, as well as other known
desaturases which share homology with each other. The individual contigs
listed in Table 18, as well as the intermediate contig 2535 and the final contig
253538a can be utilized to isolate the complete genes for human desaturases.

Uses of the human desaturases 

These human sequences can be expressed in yeast and plants utilizing the
procedures described in the preceding examples. For expression in mammalian
cells and transgenic animals, these genes may provide superior codon bias.

In addition, these sequences can be used to isolate related desaturase 
genes from other organisms. 



Table 6

Sections of the       Clone ID from        Keyword
Desaturases           LifeSeq Database

151-300 Δ5            3808675              fatty acid desaturase
301-446 Δ5            354535               Δ6
151-300 Δ6            3448789              Δ6
151-300 Δ6            1362863              Δ6
151-300 Δ6            2394760              Δ6
301-457 Δ6            3350263              Δ6



Example 13 

I. INFANT FORMULATIONS 

A. Isomil® Soy Formula with Iron. 

Usage: As a beverage for infants, children and adults with an allergy or 
sensitivity to cow's milk. A feeding for patients with disorders for which 
lactose should be avoided: lactase deficiency, lactose intolerance and 
galactosemia. 

Features: 

• Soy protein isolate to avoid symptoms of cow's-milk-protein 
allergy or sensitivity. 

• Lactose-free formulation to avoid lactose-associated diarrhea. 

• Low osmolality (240 mOsm/kg water) to reduce risk of osmotic 
diarrhea. 

• Dual carbohydrates (corn syrup and sucrose) designed to 
enhance carbohydrate absorption and reduce the risk of exceeding the 
absorptive capacity of the damaged gut. 

• 1.8 mg of iron (as ferrous sulfate) per 100 Calories to help 
prevent iron deficiency. 

• Recommended levels of vitamins and minerals. 

• Vegetable oils to provide recommended levels of essential fatty 
acids. 

• Milk-white color, milk-like consistency and pleasant aroma. 

Ingredients: (Pareve, ®) 85% water, 4.9% corn syrup, 2.6% sugar 
(sucrose), 2.1% soy oil, 1.9% soy protein isolate, 1.4% coconut oil, 0.15% 
calcium citrate, 0.11% calcium phosphate tribasic, potassium citrate, potassium 
phosphate monobasic, potassium chloride, mono- and diglycerides, soy 
lecithin, carrageenan, ascorbic acid, L-methionine, magnesium chloride, 
potassium phosphate dibasic, sodium chloride, choline chloride, taurine, ferrous 
sulfate, m-inositol, alpha-tocopheryl acetate, zinc sulfate, L-carnitine, 
niacinamide, calcium pantothenate, cupric sulfate, vitamin A palmitate, 
thiamine chloride hydrochloride, riboflavin, pyridoxine hydrochloride, folic 
acid, manganese sulfate, potassium iodide, phylloquinone, biotin, sodium 
selenite, vitamin D3 and cyanocobalamin. 

B. Isomil® DF Soy Formula For Diarrhea. 

Usage: As a short-term feeding for the dietary management of diarrhea 
in infants and toddlers. 

Features: 

• First infant formula to contain added dietary fiber from soy fiber 
specifically for diarrhea management. 

• Clinically shown to reduce the duration of loose, watery stools 
during mild to severe diarrhea in infants. 

• Nutritionally complete to meet the nutritional needs of the infant. 

• Soy protein isolate with added L-methionine meets or exceeds an 
infant's requirement for all essential amino acids. 

• Lactose-free formulation to avoid lactose-associated diarrhea. 

• Low osmolality (240 mOsm/kg water) to reduce the risk of 
osmotic diarrhea. 

• Dual carbohydrates (corn syrup and sucrose) designed to 
enhance carbohydrate absorption and reduce the risk of exceeding the 
absorptive capacity of the damaged gut. 

• Meets or exceeds the vitamin and mineral levels recommended 
by the Committee on Nutrition of the American Academy of Pediatrics 
and required by the Infant Formula Act. 

• 1.8 mg of iron (as ferrous sulfate) per 100 Calories to help 
prevent iron deficiency. 

• Vegetable oils to provide recommended levels of essential fatty 
acids. 

Ingredients: (Pareve, ®) 86% water, 4.8% corn syrup, 2.5% sugar 
(sucrose), 2.1% soy oil, 2.0% soy protein isolate, 1.4% coconut oil, 0.77% soy 
fiber, 0.12% calcium citrate, 0.11% calcium phosphate tribasic, 0.10% 
potassium citrate, potassium chloride, potassium phosphate monobasic, mono- 
and diglycerides, soy lecithin, carrageenan, magnesium chloride, ascorbic acid, 
L-methionine, potassium phosphate dibasic, sodium chloride, choline chloride, 
taurine, ferrous sulfate, m-inositol, alpha-tocopheryl acetate, zinc sulfate, 
L-carnitine, niacinamide, calcium pantothenate, cupric sulfate, vitamin A 
palmitate, thiamine chloride hydrochloride, riboflavin, pyridoxine 
hydrochloride, folic acid, manganese sulfate, potassium iodide, phylloquinone, 
biotin, sodium selenite, vitamin D3 and cyanocobalamin. 

C. Isomil® SF Sucrose-Free Soy Formula With Iron. 

Usage: As a beverage for infants, children and adults with an allergy or 
sensitivity to cow's-milk protein or an intolerance to sucrose. A feeding for 
patients with disorders for which lactose and sucrose should be avoided. 

Features: 

• Soy protein isolate to avoid symptoms of cow's-milk-protein 
allergy or sensitivity. 

• Lactose-free formulation to avoid lactose-associated diarrhea 
(carbohydrate source is Polycose® Glucose Polymers). 

• Sucrose free for the patient who cannot tolerate sucrose. 

• Low osmolality (180 mOsm/kg water) to reduce risk of osmotic 
diarrhea. 

• 1.8 mg of iron (as ferrous sulfate) per 100 Calories to help 
prevent iron deficiency. 

• Recommended levels of vitamins and minerals. 

• Vegetable oils to provide recommended levels of essential fatty 
acids. 

• Milk-white color, milk-like consistency and pleasant aroma. 

Ingredients: (Pareve, ®) 75% water, 11.8% hydrolyzed cornstarch, 4.1% 
soy oil, 4.1% soy protein isolate, 2.8% coconut oil, 1.0% modified cornstarch, 
0.38% calcium phosphate tribasic, 0.17% potassium citrate, 0.13% potassium 
chloride, mono- and diglycerides, soy lecithin, magnesium chloride, ascorbic 
acid, L-methionine, calcium carbonate, sodium chloride, choline chloride, 
carrageenan, taurine, ferrous sulfate, m-inositol, alpha-tocopheryl acetate, zinc 
sulfate, L-carnitine, niacinamide, calcium pantothenate, cupric sulfate, vitamin 
A palmitate, thiamine chloride hydrochloride, riboflavin, pyridoxine 
hydrochloride, folic acid, manganese sulfate, potassium iodide, phylloquinone, 
biotin, sodium selenite, vitamin D3 and cyanocobalamin. 

D. Isomil® 20 Soy Formula With Iron Ready To Feed, 
20 Cal/fl oz. 

Usage: When a soy feeding is desired. 

Ingredients: (Pareve, ®) 85% water, 4.9% corn syrup, 2.6% sugar 
(sucrose), 2.1% soy oil, 1.9% soy protein isolate, 1.4% coconut oil, 0.15% 
calcium citrate, 0.11% calcium phosphate tribasic, potassium citrate, potassium 
phosphate monobasic, potassium chloride, mono- and diglycerides, soy 
lecithin, carrageenan, ascorbic acid, L-methionine, magnesium chloride, 
potassium phosphate dibasic, sodium chloride, choline chloride, taurine, ferrous 
sulfate, m-inositol, alpha-tocopheryl acetate, zinc sulfate, L-carnitine, 
niacinamide, calcium pantothenate, cupric sulfate, vitamin A palmitate, 
thiamine chloride hydrochloride, riboflavin, pyridoxine hydrochloride, folic 
acid, manganese sulfate, potassium iodide, phylloquinone, biotin, sodium 
selenite, vitamin D3 and cyanocobalamin. 

E. Similac® Infant Formula 

Usage: When an infant formula is needed: if the decision is made to 
discontinue breastfeeding before age 1 year, if a supplement to breastfeeding is 
needed or as a routine feeding if breastfeeding is not adopted. 

Features: 

• Protein of appropriate quality and quantity for good growth; 
heat-denatured, which reduces the risk of milk-associated enteric blood 
loss. 

• Fat from a blend of vegetable oils (doubly homogenized), 
providing essential linoleic acid that is easily absorbed. 

• Carbohydrate as lactose in proportion similar to that of human 
milk. 

• Low renal solute load to minimize stress on developing organs. 

• Powder, Concentrated Liquid and Ready To Feed forms. 

Ingredients: (®-D) Water, nonfat milk, lactose, soy oil, coconut oil, 
mono- and diglycerides, soy lecithin, ascorbic acid, carrageenan, choline 
chloride, taurine, m-inositol, alpha-tocopheryl acetate, zinc sulfate, niacinamide, 
ferrous sulfate, calcium pantothenate, cupric sulfate, vitamin A palmitate, 
thiamine chloride hydrochloride, riboflavin, pyridoxine hydrochloride, folic 
acid, manganese sulfate, phylloquinone, biotin, sodium selenite, vitamin D3 and 
cyanocobalamin. 

F. Similac® NeoCare Premature Infant Formula With Iron 

Usage: For premature infants' special nutritional needs after hospital 
discharge. Similac NeoCare is a nutritionally complete formula developed to 
provide premature infants with extra calories, protein, vitamins and minerals 
needed to promote catch-up growth and support development. 

Features: 

• Reduces the need for caloric and vitamin supplementation. More 
calories (22 Cal/fl oz) than standard term formulas (20 Cal/fl oz). 

• Highly absorbed fat blend, with medium-chain triglycerides 
(MCT oil) to help meet the special digestive needs of premature infants. 

• Higher levels of protein, vitamins and minerals per 100 Calories 
to extend the nutritional support initiated in-hospital. 

• More calcium and phosphorus for improved bone mineralization. 

Ingredients: ®-D Corn syrup solids, nonfat milk, lactose, whey protein 
concentrate, soy oil, high-oleic safflower oil, fractionated coconut oil 
(medium-chain triglycerides), coconut oil, potassium citrate, calcium phosphate 
tribasic, calcium carbonate, ascorbic acid, magnesium chloride, potassium chloride, 
sodium chloride, taurine, ferrous sulfate, m-inositol, choline chloride, ascorbyl 
palmitate, L-carnitine, alpha-tocopheryl acetate, zinc sulfate, niacinamide, 
mixed tocopherols, sodium citrate, calcium pantothenate, cupric sulfate, 
thiamine chloride hydrochloride, vitamin A palmitate, beta carotene, riboflavin, 
pyridoxine hydrochloride, folic acid, manganese sulfate, phylloquinone, biotin, 
sodium selenite, vitamin D3 and cyanocobalamin. 

G. Similac Natural Care Low-Iron Human Milk Fortifier Ready 
To Use, 24 Cal/fl oz. 

Usage: Designed to be mixed with human milk or to be fed alternatively 
with human milk to low-birth-weight infants. 

Ingredients: ®-D Water, nonfat milk, hydrolyzed cornstarch, lactose, 
fractionated coconut oil (medium-chain triglycerides), whey protein 
concentrate, soy oil, coconut oil, calcium phosphate tribasic, potassium citrate, 
magnesium chloride, sodium citrate, ascorbic acid, calcium carbonate, mono- 
and diglycerides, soy lecithin, carrageenan, choline chloride, m-inositol, taurine, 
niacinamide, L-carnitine, alpha-tocopheryl acetate, zinc sulfate, potassium 
chloride, calcium pantothenate, ferrous sulfate, cupric sulfate, riboflavin, 
vitamin A palmitate, thiamine chloride hydrochloride, pyridoxine 
hydrochloride, biotin, folic acid, manganese sulfate, phylloquinone, vitamin D3, 
sodium selenite and cyanocobalamin. 

Various PUFAs of this invention can be substituted and/or added to the 
infant formulae described above and to other infant formulae known to those in 
the art. 

II. NUTRITIONAL FORMULATIONS 

A. ENSURE® 

Usage: ENSURE is a low-residue liquid food designed primarily as an 
oral nutritional supplement to be used with or between meals or, in appropriate 
amounts, as a meal replacement. ENSURE is lactose- and gluten-free, and is 
suitable for use in modified diets, including low-cholesterol diets. Although it 
is primarily an oral supplement, it can be fed by tube. 

Patient Conditions: 

• For patients on modified diets 

• For elderly patients at nutrition risk 

• For patients with involuntary weight loss 

• For patients recovering from illness or surgery 

• For patients who need a low-residue diet 

Ingredients: 

®-D Water, Sugar (Sucrose), Maltodextrin (Corn), Calcium and Sodium 
Caseinates, High-Oleic Safflower Oil, Soy Protein Isolate, Soy Oil, Canola Oil, 
Potassium Citrate, Calcium Phosphate Tribasic, Sodium Citrate, Magnesium 
Chloride, Magnesium Phosphate Dibasic, Artificial Flavor, Sodium Chloride, 
Soy Lecithin, Choline Chloride, Ascorbic Acid, Carrageenan, Zinc Sulfate, 
Ferrous Sulfate, Alpha-Tocopheryl Acetate, Gellan Gum, Niacinamide, 
Calcium Pantothenate, Manganese Sulfate, Cupric Sulfate, Vitamin A 
Palmitate, Thiamine Chloride Hydrochloride, Pyridoxine Hydrochloride, 
Riboflavin, Folic Acid, Sodium Molybdate, Chromium Chloride, Biotin, 
Potassium Iodide, Sodium Selenate. 

B. ENSURE® BARS 

Usage: ENSURE BARS are complete, balanced nutrition for 
supplemental use between or with meals. They provide a delicious, nutrient-rich 
alternative to other snacks. ENSURE BARS contain <1 g lactose/bar, and 
Chocolate Fudge Brownie flavor is gluten-free. (Honey Graham Crunch flavor 
contains gluten.) 

Patient Conditions: 

• For patients who need extra calories, protein, vitamins and minerals 

• Especially useful for people who do not take in enough calories and 
nutrients 

• For people who have the ability to chew and swallow 

• Not to be used by anyone with a peanut allergy or any type of allergy to 
nuts. 

Ingredients: 

Honey Graham Crunch - High-Fructose Corn Syrup, Soy Protein 
Isolate, Brown Sugar, Honey, Maltodextrin (Corn), Crisp Rice (Milled Rice, 
Sugar [Sucrose], Salt [Sodium Chloride] and Malt), Oat Bran, Partially 
Hydrogenated Cottonseed and Soy Oils, Soy Polysaccharide, Glycerine, Whey 
Protein Concentrate, Polydextrose, Fructose, Calcium Caseinate, Cocoa 
Powder, Artificial Flavors, Canola Oil, High-Oleic Safflower Oil, Nonfat Dry 
Milk, Whey Powder, Soy Lecithin and Corn Oil. Manufactured in a facility that 
processes nuts. 

Vitamins and Minerals: 

Calcium Phosphate Tribasic, Potassium Phosphate Dibasic, Magnesium 
Oxide, Salt (Sodium Chloride), Potassium Chloride, Ascorbic Acid, Ferric 
Orthophosphate, Alpha-Tocopheryl Acetate, Niacinamide, Zinc Oxide, Calcium 
Pantothenate, Copper Gluconate, Manganese Sulfate, Riboflavin, Beta-Carotene, 
Pyridoxine Hydrochloride, Thiamine Mononitrate, Folic Acid, Biotin, 
Chromium Chloride, Potassium Iodide, Sodium Selenate, Sodium Molybdate, 
Phylloquinone, Vitamin D3 and Cyanocobalamin. 

Protein: 

Honey Graham Crunch - The protein source is a blend of soy protein isolate 
and milk proteins. 

Soy protein isolate    74% 
Milk proteins          26% 

Fat: 

Honey Graham Crunch - The fat source is a blend of partially 
hydrogenated cottonseed and soybean, canola, high-oleic safflower, and corn 
oils, and soy lecithin. 

Partially hydrogenated cottonseed and soybean oil    76% 
Canola oil                                            8% 
High-oleic safflower oil                              8% 
Corn oil                                              4% 
Soy lecithin                                          4% 

Carbohydrate: 

Honey Graham Crunch - The carbohydrate source is a combination of 
high-fructose corn syrup, brown sugar, maltodextrin, honey, crisp rice, 
glycerine, soy polysaccharide, and oat bran. 

High-fructose corn syrup    24% 
Brown sugar                 21% 
Maltodextrin                12% 
Honey                       11% 
Crisp rice                   9% 
Glycerine                    9% 
Soy polysaccharide           7% 
Oat bran                     7% 

C. ENSURE® HIGH PROTEIN 

Usage: ENSURE HIGH PROTEIN is a concentrated, high-protein 
liquid food designed for people who require additional calories, protein, 
vitamins, and minerals in their diets. It can be used as an oral nutritional 
supplement with or between meals or, in appropriate amounts, as a meal 
replacement. ENSURE HIGH PROTEIN is lactose- and gluten-free, and is 
suitable for use by people recovering from general surgery or hip fractures and 
by patients at risk for pressure ulcers. 

Patient Conditions: 

• For patients who require additional calories, protein, vitamins, and minerals, 
such as patients recovering from general surgery or hip fractures, patients at risk 
for pressure ulcers, and patients on low-cholesterol diets 

Features: 

• Low in saturated fat 

• Contains 6 g of total fat and < 5 mg of cholesterol per serving 

• Rich, creamy taste 

• Excellent source of protein, calcium, and other essential vitamins and 
minerals 

• For low-cholesterol diets 

• Lactose-free, easily digested 

Ingredients: 

Vanilla Supreme: ®-D Water, Sugar (Sucrose), Maltodextrin (Corn), Calcium 
and Sodium Caseinates, High-Oleic Safflower Oil, Soy Protein Isolate, Soy Oil, 
Canola Oil, Potassium Citrate, Calcium Phosphate Tribasic, Sodium Citrate, 
Magnesium Chloride, Magnesium Phosphate Dibasic, Artificial Flavor, Sodium 
Chloride, Soy Lecithin, Choline Chloride, Ascorbic Acid, Carrageenan, Zinc 
Sulfate, Ferrous Sulfate, Alpha-Tocopheryl Acetate, Gellan Gum, Niacinamide, 
Calcium Pantothenate, Manganese Sulfate, Cupric Sulfate, Vitamin A 
Palmitate, Thiamine Chloride Hydrochloride, Pyridoxine Hydrochloride, 
Riboflavin, Folic Acid, Sodium Molybdate, Chromium Chloride, Biotin, 
Potassium Iodide, Sodium Selenate, Phylloquinone, Vitamin D3 and 
Cyanocobalamin. 

Protein: 

The protein source is a blend of two high-biologic-value proteins: casein and 
soy. 

Sodium and calcium caseinates    85% 
Soy protein isolate              15% 

Fat: 

The fat source is a blend of three oils: high-oleic safflower, canola, and soy. 

High-oleic safflower oil    40% 
Canola oil                  30% 
Soy oil                     30% 

The level of fat in ENSURE HIGH PROTEIN meets American Heart 
Association (AHA) guidelines. The 6 grams of fat in ENSURE HIGH 
PROTEIN represent 24% of the total calories, with 2.6% of the fat being from 
saturated fatty acids and 7.9% from polyunsaturated fatty acids. These values 
are within the AHA guidelines of < 30% of total calories from fat, < 10% of the 
calories from saturated fatty acids, and < 10% of total calories from 
polyunsaturated fatty acids. 
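
The comparison against the AHA limits quoted above can be sketched in a few lines of Python. This is only an illustrative check under stated assumptions: the factor of 9 kcal per gram of fat is the general Atwater value and is not given in the text, the saturated and polyunsaturated figures are read here as shares of total calories, and the function names are hypothetical.

```python
KCAL_PER_G_FAT = 9.0  # assumed general Atwater factor, not stated in the text


def implied_serving_kcal(fat_g: float, fat_pct_of_kcal: float) -> float:
    """Serving energy implied by a fat mass and its share of total calories."""
    return fat_g * KCAL_PER_G_FAT / (fat_pct_of_kcal / 100.0)


def within_aha_limits(fat_pct: float, sat_pct: float, pufa_pct: float) -> bool:
    """Limits as quoted: < 30% fat, < 10% saturated, < 10% polyunsaturated."""
    return fat_pct < 30.0 and sat_pct < 10.0 and pufa_pct < 10.0


# Figures quoted for ENSURE HIGH PROTEIN: 6 g fat, 24% of calories.
serving_kcal = implied_serving_kcal(6.0, 24.0)
meets_limits = within_aha_limits(24.0, 2.6, 7.9)
```

Under these assumptions the quoted figures imply a serving of roughly 225 Calories, and all three percentages fall below the quoted limits.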

Carbohydrate: 

ENSURE HIGH PROTEIN contains a combination of maltodextrin and 
sucrose. The mild sweetness and flavor variety (vanilla supreme, chocolate 
royal, wild berry, and banana), plus VARI-FLAVORS® Flavor Pacs in pecan, 
cherry, strawberry, lemon, and orange, help to prevent flavor fatigue and aid in 
patient compliance. 

Vanilla and other nonchocolate flavors 

Sucrose         60% 
Maltodextrin    40% 

Chocolate 

Sucrose         70% 
Maltodextrin    30% 

D. ENSURE® LIGHT 

Usage: ENSURE LIGHT is a low-fat liquid food designed for use as an 
oral nutritional supplement with or between meals. ENSURE LIGHT is 
lactose- and gluten-free, and is suitable for use in modified diets, including 
low-cholesterol diets. 

Patient Conditions: 

• For normal-weight or overweight patients who need extra nutrition in a 
supplement that contains 50% less fat and 20% fewer calories than ENSURE 

• For healthy adults who don't eat right and need extra nutrition 

Features: 

• Low in fat and saturated fat 

• Contains 3 g of total fat per serving and < 5 mg cholesterol 

• Rich, creamy taste 

• Excellent source of calcium and other essential vitamins and minerals 

• For low-cholesterol diets 

• Lactose-free, easily digested 

Ingredients: 

French Vanilla: ®-D Water, Maltodextrin (Corn), Sugar (Sucrose), Calcium 
Caseinate, High-Oleic Safflower Oil, Canola Oil, Magnesium Chloride, Sodium 
Citrate, Potassium Citrate, Potassium Phosphate Dibasic, Magnesium Phosphate 
Dibasic, Natural and Artificial Flavor, Calcium Phosphate Tribasic, Cellulose 
Gel, Choline Chloride, Soy Lecithin, Carrageenan, Salt (Sodium Chloride), 
Ascorbic Acid, Cellulose Gum, Ferrous Sulfate, Alpha-Tocopheryl Acetate, 
Zinc Sulfate, Niacinamide, Manganese Sulfate, Calcium Pantothenate, Cupric 
Sulfate, Thiamine Chloride Hydrochloride, Vitamin A Palmitate, Pyridoxine 
Hydrochloride, Riboflavin, Chromium Chloride, Folic Acid, Sodium 
Molybdate, Biotin, Potassium Iodide, Sodium Selenate, Phylloquinone, Vitamin 
D3 and Cyanocobalamin. 

Protein: 

The protein source is calcium caseinate. 

Calcium caseinate    100% 

Fat: 

The fat source is a blend of two oils: high-oleic safflower and canola. 

High-oleic safflower oil    70% 
Canola oil                  30% 

The level of fat in ENSURE LIGHT meets American Heart Association 
(AHA) guidelines. The 3 grams of fat in ENSURE LIGHT represent 13.5% of 
the total calories, with 1.4% of the fat being from saturated fatty acids and 2.6% 
from polyunsaturated fatty acids. These values are within the AHA guidelines 
of < 30% of total calories from fat, < 10% of the calories from saturated fatty 
acids, and < 10% of total calories from polyunsaturated fatty acids. 

Carbohydrate: 

ENSURE LIGHT contains a combination of maltodextrin and sucrose. 
The chocolate flavor contains corn syrup as well. The mild sweetness and 
flavor variety (French vanilla, chocolate supreme, strawberry swirl), plus 
VARI-FLAVORS® Flavor Pacs in pecan, cherry, strawberry, lemon, and 
orange, help to prevent flavor fatigue and aid in patient compliance. 

Vanilla and other nonchocolate flavors 

Sucrose         51% 
Maltodextrin    49% 

Chocolate 

Sucrose         47.0% 
Corn Syrup      26.5% 
Maltodextrin    26.5% 

Vitamins and Minerals 

An 8-fl-oz serving of ENSURE LIGHT provides at least 25% of the 
RDIs for 24 key vitamins and minerals. 

Caffeine 

Chocolate flavor contains 2.1 mg caffeine/8 fl oz. 

E. ENSURE PLUS® 

Usage: ENSURE PLUS is a high-calorie, low-residue liquid food for 
use when extra calories and nutrients, but a normal concentration of protein, are 
needed. It is designed primarily as an oral nutritional supplement to be used 
with or between meals or, in appropriate amounts, as a meal replacement. 
ENSURE PLUS is lactose- and gluten-free. Although it is primarily an oral 
nutritional supplement, it can be fed by tube. 

Patient Conditions: 

• For patients who require extra calories and nutrients, but a normal 
concentration of protein, in a limited volume 

• For patients who need to gain or maintain healthy weight 

Features 

• Rich, creamy taste 

• Good source of essential vitamins and minerals 

Ingredients 

Vanilla: ®-D Water, Corn Syrup, Maltodextrin (Corn), Corn Oil, Sodium and 
Calcium Caseinates, Sugar (Sucrose), Soy Protein Isolate, Magnesium Chloride, 
Potassium Citrate, Calcium Phosphate Tribasic, Soy Lecithin, Natural and 
Artificial Flavor, Sodium Citrate, Potassium Chloride, Choline Chloride, 
Ascorbic Acid, Carrageenan, Zinc Sulfate, Ferrous Sulfate, Alpha-Tocopheryl 
Acetate, Niacinamide, Calcium Pantothenate, Manganese Sulfate, Cupric 
Sulfate, Thiamine Chloride Hydrochloride, Pyridoxine Hydrochloride, 
Riboflavin, Vitamin A Palmitate, Folic Acid, Biotin, Chromium Chloride, 
Sodium Molybdate, Potassium Iodide, Sodium Selenite, Phylloquinone, 
Cyanocobalamin and Vitamin D3. 

Protein 

The protein source is a blend of two high-biologic-value proteins: casein 
and soy. 

Sodium and calcium caseinates    84% 
Soy protein isolate              16% 

Fat 

The fat source is corn oil. 

Corn oil    100% 

Carbohydrate 

ENSURE PLUS contains a combination of maltodextrin and sucrose. 
The mild sweetness and flavor variety (vanilla, chocolate, strawberry, coffee, 
butter pecan, and eggnog), plus VARI-FLAVORS® Flavor Pacs in pecan, 
cherry, strawberry, lemon, and orange, help to prevent flavor fatigue and aid in 
patient compliance. 

Vanilla, strawberry, butter pecan, and coffee flavors 

Corn Syrup      39% 
Maltodextrin    38% 
Sucrose         23% 

Chocolate and eggnog flavors 

Corn Syrup      36% 
Maltodextrin    34% 
Sucrose         30% 

Vitamins and Minerals 

An 8-fl-oz serving of ENSURE PLUS provides at least 15% of the RDIs 
for 25 key vitamins and minerals. 

Caffeine 

Chocolate flavor contains 3.1 mg caffeine/8 fl oz. Coffee flavor 
contains a trace amount of caffeine. 



F. ENSURE PLUS® HN 

Usage: ENSURE PLUS HN is a nutritionally complete high-calorie, 
high-nitrogen liquid food designed for people with higher calorie and protein 
needs or limited volume tolerance. It may be used for oral supplementation or 
for total nutritional support by tube. ENSURE PLUS HN is lactose- and 
gluten-free. 

Patient Conditions: 

• For patients with increased calorie and protein needs, such as following 
surgery or injury 

• For patients with limited volume tolerance and early satiety 

Features 

• For supplemental or total nutrition 

• For oral or tube feeding 

• 1.5 Cal/mL 

• High nitrogen 

• Calorically dense 

Ingredients 

Vanilla: ®-D Water, Maltodextrin (Corn), Sodium and Calcium Caseinates, 
Corn Oil, Sugar (Sucrose), Soy Protein Isolate, Magnesium Chloride, Potassium 
Citrate, Calcium Phosphate Tribasic, Soy Lecithin, Natural and Artificial 
Flavor, Sodium Citrate, Choline Chloride, Ascorbic Acid, Taurine, L-Carnitine, 
Zinc Sulfate, Ferrous Sulfate, Alpha-Tocopheryl Acetate, Niacinamide, 
Carrageenan, Calcium Pantothenate, Manganese Sulfate, Cupric Sulfate, 
Thiamine Chloride Hydrochloride, Pyridoxine Hydrochloride, Riboflavin, 
Vitamin A Palmitate, Folic Acid, Biotin, Chromium Chloride, Sodium 
Molybdate, Potassium Iodide, Sodium Selenite, Phylloquinone, 
Cyanocobalamin and Vitamin D3. 

G. ENSURE® POWDER 

Usage: ENSURE POWDER (reconstituted with water) is a low-residue 
liquid food designed primarily as an oral nutritional supplement to be used with 
or between meals. ENSURE POWDER is lactose- and gluten-free, and is 
suitable for use in modified diets, including low-cholesterol diets. 

Patient Conditions: 

• For patients on modified diets 

• For elderly patients at nutrition risk 

• For patients recovering from illness/surgery 

• For patients who need a low-residue diet 

Features 

• Convenient, easy to mix 

• Low in saturated fat 

• Contains 9 g of total fat and < 5 mg of cholesterol per serving 

• High in vitamins and minerals 

• For low-cholesterol diets 

• Lactose-free, easily digested 

Ingredients: ®-D Corn Syrup, Maltodextrin (Corn), Sugar (Sucrose), Corn Oil, 
Sodium and Calcium Caseinates, Soy Protein Isolate, Artificial Flavor, 
Potassium Citrate, Magnesium Chloride, Sodium Citrate, Calcium Phosphate 
Tribasic, Potassium Chloride, Soy Lecithin, Ascorbic Acid, Choline Chloride, 
Zinc Sulfate, Ferrous Sulfate, Alpha-Tocopheryl Acetate, Niacinamide, 
Calcium Pantothenate, Manganese Sulfate, Thiamine Chloride Hydrochloride, 
Cupric Sulfate, Pyridoxine Hydrochloride, Riboflavin, Vitamin A Palmitate, 
Folic Acid, Biotin, Sodium Molybdate, Chromium Chloride, Potassium Iodide, 
Sodium Selenate, Phylloquinone, Vitamin D3 and Cyanocobalamin. 

Protein 

The protein source is a blend of two high-biologic-value proteins: casein 
and soy. 

Sodium and calcium caseinates    84% 
Soy protein isolate              16% 

Fat 

The fat source is corn oil. 

Corn oil    100% 

Carbohydrate 

ENSURE POWDER contains a combination of corn syrup, 
maltodextrin, and sucrose. The mild sweetness of ENSURE POWDER, plus 
VARI-FLAVORS® Flavor Pacs in pecan, cherry, strawberry, lemon, and 
orange, helps to prevent flavor fatigue and aid in patient compliance. 

Vanilla 

Corn Syrup      35% 
Maltodextrin    35% 
Sucrose         30% 

H. ENSURE® PUDDING 

Usage: ENSURE PUDDING is a nutrient-dense supplement providing 
balanced nutrition in a nonliquid form to be used with or between meals. It is 
appropriate for consistency-modified diets (e.g., soft, pureed, or full liquid) or 
for people with swallowing impairments. ENSURE PUDDING is gluten-free. 

Patient Conditions: 

• For patients on consistency-modified diets (e.g., soft, pureed, or full liquid) 

• For patients with swallowing impairments 

Features 

• Rich and creamy, good taste 

• Good source of essential vitamins and minerals 

• Convenient - needs no refrigeration 

• Gluten-free 

Nutrient Profile per 5 oz: Calories 250, Protein 10.9%, Total Fat 34.9%, 
Carbohydrate 54.2% 

Ingredients: 

Vanilla: ®-D Nonfat Milk, Water, Sugar (Sucrose), Partially Hydrogenated 
Soybean Oil, Modified Food Starch, Magnesium Sulfate, Sodium Stearoyl 
Lactylate, Sodium Phosphate Dibasic, Artificial Flavor, Ascorbic Acid, Zinc 
Sulfate, Ferrous Sulfate, Alpha-Tocopheryl Acetate, Choline Chloride, 
Niacinamide, Manganese Sulfate, Calcium Pantothenate, FD&C Yellow #5, 
Potassium Citrate, Cupric Sulfate, Vitamin A Palmitate, Thiamine Chloride 
Hydrochloride, Pyridoxine Hydrochloride, Riboflavin, FD&C Yellow #6, Folic 
Acid, Biotin, Phylloquinone, Vitamin D3 and Cyanocobalamin. 

Protein 

The protein source is nonfat milk. 

Nonfat milk    100% 

Fat 

The fat source is hydrogenated soybean oil. 

Hydrogenated soybean oil    100% 

Carbohydrate 

ENSURE PUDDING contains a combination of sucrose and modified 
food starch. The mild sweetness and flavor variety (vanilla, chocolate, 
butterscotch, and tapioca) help prevent flavor fatigue. The product contains 9.2 
grams of lactose per serving. 

Vanilla and other nonchocolate flavors 

Sucrose                 56% 
Lactose                 27% 
Modified food starch    17% 

Chocolate 

Sucrose                 58% 
Lactose                 26% 
Modified food starch    16% 



I. ENSURE® WITH FIBER 

Usage: ENSURE WITH FIBER is a fiber-containing, nutritionally 
complete liquid food designed for people who can benefit from increased 
dietary fiber and nutrients. ENSURE WITH FIBER is suitable for people who 
do not require a low-residue diet. It can be fed orally or by tube, and can be 
used as a nutritional supplement to a regular diet or, in appropriate amounts, as 
a meal replacement. ENSURE WITH FIBER is lactose- and gluten-free, and is 
suitable for use in modified diets, including low-cholesterol diets. 

Patient Conditions 

• For patients who can benefit from increased dietary fiber and nutrients 

Features 

• New advanced formula - low in saturated fat, higher in vitamins and minerals 

• Contains 6 g of total fat and < 5 mg of cholesterol per serving 

• Rich, creamy taste 

• Good source of fiber 

• Excellent source of essential vitamins and minerals 

• For low-cholesterol diets 

• Lactose- and gluten-free 
Ingredients 

Vanilla: ®-D Water, Maltodextrin (Corn), Sugar (Sucrose), Sodium and 
Calcium Caseinates, Oat Fiber, High-Oleic Safflower Oil, Canola Oil, Soy 
Protein Isolate, Corn Oil, Soy Fiber, Calcium Phosphate Tribasic, Magnesium 
Chloride, Potassium Citrate, Cellulose Gel, Soy Lecithin, Potassium Phosphate 
Dibasic, Sodium Citrate, Natural and Artificial Flavors, Choline Chloride, 
Magnesium Phosphate, Ascorbic Acid, Cellulose Gum, Potassium Chloride, 
Carrageenan, Ferrous Sulfate, Alpha-Tocopheryl Acetate, Zinc Sulfate, 
Niacinamide, Manganese Sulfate, Calcium Pantothenate, Cupric Sulfate, 
Vitamin A Palmitate, Thiamine Chloride Hydrochloride, Pyridoxine 
Hydrochloride, Riboflavin, Folic Acid, Chromium Chloride, Biotin, Sodium 
Molybdate, Potassium Iodide, Sodium Selenate, Phylloquinone, Vitamin D3 and 
Cyanocobalamin. 

Protein 

The protein source is a blend of two high-biologic-value proteins: casein 
and soy. 

Sodium and calcium caseinates    80% 
Soy protein isolate              20% 

Fat 

The fat source is a blend of three oils: high-oleic safflower, canola, and 
corn. 

High-oleic safflower oil    40% 
Canola oil                  40% 
Corn oil                    20% 

The level of fat in ENSURE WITH FIBER meets American Heart 
Association (AHA) guidelines. The 6 grams of fat in ENSURE WITH FIBER 
represent 22% of the total calories, with 2.01% of the fat being from saturated 
fatty acids and 6.7% from polyunsaturated fatty acids. These values are within 
the AHA guidelines of < 30% of total calories from fat, < 10% of the calories 
from saturated fatty acids, and < 10% of total calories from polyunsaturated 
fatty acids. 

Carbohydrate 

ENSURE WITH FIBER contains a combination of maltodextrin and 
sucrose. The mild sweetness and flavor variety (vanilla, chocolate, and butter 
pecan), plus VARI-FLAVORS® Flavor Pacs in pecan, cherry, strawberry, 
lemon, and orange, help to prevent flavor fatigue and aid in patient compliance. 

Vanilla and other nonchocolate flavors 

Maltodextrin    66% 
Sucrose         25% 
Oat Fiber        7% 
Soy Fiber        2% 

Chocolate 

Maltodextrin    55% 
Sucrose         36% 
Oat Fiber        7% 
Soy Fiber        2% 

Fiber 

The fiber blend used in ENSURE WITH FIBER consists of oat fiber and
soy polysaccharide. This blend results in approximately 4 grams of total dietary
fiber per 8-fl-oz can. The ratio of insoluble to soluble fiber is 95:5.

The various nutritional supplements described above and known to
others of skill in the art can be substituted and/or supplemented with the PUFAs
of this invention.

J. Oxepa™ Nutritional Product

Oxepa is a low-carbohydrate, calorically dense enteral nutritional product
designed for the dietary management of patients with or at risk for ARDS. It
has a unique combination of ingredients, including a patented oil blend
containing eicosapentaenoic acid (EPA from fish oil), γ-linolenic acid (GLA
from borage oil), and elevated antioxidant levels.

Caloric Distribution:

• Caloric density is high at 1.5 Cal/mL (355 Cal/8 fl oz), to minimize the
volume required to meet energy needs.

• The distribution of Calories in Oxepa is shown in Table 7.



Table 7. Caloric Distribution of Oxepa

                   per 8 fl oz   per liter   % of Cal
Calories           355           1,500
Fat (g)            22.2          93.7        55.2
Carbohydrate (g)   25            105.5       28.1
Protein (g)        14.8          62.5        16.7
Water (g)          186           785
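
The per-liter column of Table 7 follows from the per-serving column by a volume scaling. The sketch below assumes one 8-fl-oz serving is about 237 mL; that figure is an assumption made here, not a value stated in the specification, and `per_liter` is an illustrative helper name.

```python
SERVING_ML = 237.0  # assumed volume of one 8-fl-oz serving, in mL

# Per-serving values copied from Table 7.
per_serving = {
    "Calories": 355,
    "Fat (g)": 22.2,
    "Carbohydrate (g)": 25,
    "Protein (g)": 14.8,
    "Water (g)": 186,
}


def per_liter(amount, serving_ml=SERVING_ML):
    """Scale a per-serving amount to a per-liter amount."""
    return amount * 1000.0 / serving_ml


scaled = {name: per_liter(value) for name, value in per_serving.items()}
for name, value in scaled.items():
    print(f"{name}: {value:.1f} per liter")
```

Under that assumption the scaled values land close to the per-liter figures in the table; small differences are consistent with rounding in the published numbers.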





Fat:

• Oxepa contains 22.2 g of fat per 8-fl-oz serving (93.7 g/L).

• The fat source is an oil blend of 31.8% canola oil, 25% medium-chain
triglycerides (MCTs), 20% borage oil, 20% fish oil, and 3.2% soy lecithin. The
typical fatty acid profile of Oxepa is shown in Table 8.

• Oxepa provides a balanced amount of polyunsaturated, monounsaturated,
and saturated fatty acids, as shown in Table 10.

• Medium-chain triglycerides (MCTs), 25% of the fat blend, aid gastric
emptying because they are absorbed by the intestinal tract without
emulsification by bile acids.

The various fatty acid components of Oxepa™ nutritional product can
be substituted and/or supplemented with the PUFAs of this invention.



Table 8. Typical Fatty Acid Profile

                                 % Total Fatty Acids   g/8 fl oz*   g/L*
Caproic (6:0)                    0.2                   0.04         0.18
Caprylic (8:0)                   14.69                 3.1          13.07
Capric (10:0)                    11.06                 2.33         9.87
Palmitic (16:0)                  5.59                  1.18         4.98
Palmitoleic (16:1n-7)            1.82                  0.38         1.62
Stearic (18:0)                   1.84                  0.39         1.64
Oleic (18:1n-9)                  24.44                 5.16         21.75
Linoleic (18:2n-6)               16.28                 3.44         14.49
α-Linolenic (18:3n-3)            3.47                  0.73         3.09
γ-Linolenic (18:3n-6)            4.82                  1.02         4.29
Eicosapentaenoic (20:5n-3)       5.11                  1.08         4.55
n-3-Docosapentaenoic (22:5n-3)   0.55                  0.12         0.49
Docosahexaenoic (22:6n-3)        2.27                  0.48         2.02
Others                           7.55                  1.52         6.72

* Fatty acids equal approximately 95% of total fat.
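
The g/L column of Table 8 can be approximated from the percentage column, the stated total fat content (93.7 g/L) and the footnote that fatty acids make up roughly 95% of total fat. The sketch below applies that relationship to a few rows; treating 95% as an exact factor is an assumption, and `grams_per_liter` is an illustrative name.

```python
TOTAL_FAT_G_PER_L = 93.7     # total fat per liter, from the text
FATTY_ACID_FRACTION = 0.95   # footnote: fatty acids are about 95% of total fat

# Percent of total fatty acids, copied from Table 8 (selected rows).
percent_of_fatty_acids = {
    "Caprylic (8:0)": 14.69,
    "Oleic (18:1n-9)": 24.44,
    "Linoleic (18:2n-6)": 16.28,
    "Eicosapentaenoic (20:5n-3)": 5.11,
    "Docosahexaenoic (22:6n-3)": 2.27,
}


def grams_per_liter(percent):
    """Estimate g/L of one fatty acid from its share of total fatty acids."""
    total_fatty_acids = TOTAL_FAT_G_PER_L * FATTY_ACID_FRACTION
    return percent / 100.0 * total_fatty_acids


estimates = {name: grams_per_liter(p) for name, p in percent_of_fatty_acids.items()}
for name, grams in estimates.items():
    print(f"{name}: {grams:.2f} g/L")
```

The estimates agree with the tabulated g/L values to within a few hundredths of a gram, which suggests the table was built the same way.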


Table 9. Fat Profile of Oxepa.

% of total calories from fat    55.2
Polyunsaturated fatty acids     31.44 g/L
Monounsaturated fatty acids     25.53 g/L
Saturated fatty acids           32.38 g/L
n-6 to n-3 ratio                1.75:1
Cholesterol                     9.49 mg/8 fl oz
                                40.1 mg/L

Carbohydrate:

• The carbohydrate content is 25.0 g per 8-fl-oz serving (105.5 g/L).

• The carbohydrate sources are 45% maltodextrin (a complex carbohydrate)
and 55% sucrose (a simple sugar), both of which are readily digested and
absorbed.

• The high-fat and low-carbohydrate content of Oxepa is designed to
minimize carbon dioxide (CO2) production. High CO2 levels can complicate
weaning in ventilator-dependent patients. The low level of carbohydrate also
may be useful for those patients who have developed stress-induced
hyperglycemia.

• Oxepa is lactose-free.

Dietary carbohydrate, the amino acids from protein, and the glycerol
moiety of fats can be converted to glucose within the body. Throughout this
process, the carbohydrate requirements of glucose-dependent tissues (such as
the central nervous system and red blood cells) are met. However, a diet free of
carbohydrates can lead to ketosis, excessive catabolism of tissue protein, and
loss of fluid and electrolytes. These effects can be prevented by daily ingestion
of 50 to 100 g of digestible carbohydrate, if caloric intake is adequate. The
carbohydrate level in Oxepa is also sufficient to minimize gluconeogenesis, if
energy needs are being met.

Protein:

• Oxepa contains 14.8 g of protein per 8-fl-oz serving (62.5 g/L).

• The total calorie/nitrogen ratio (150:1) meets the need of stressed patients.

• Oxepa provides enough protein to promote anabolism and the maintenance
of lean body mass without precipitating respiratory problems. High protein
intakes are a concern in patients with respiratory insufficiency. Although
protein has little effect on CO2 production, a high protein diet will increase
ventilatory drive.

• The protein sources of Oxepa are 86.8% sodium caseinate and 13.2%
calcium caseinate.

• As demonstrated in Table 11, the amino acid profile of the protein system in
Oxepa meets or surpasses the standard for high quality protein set by
the National Academy of Sciences.

• Oxepa is gluten-free.
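
The 150:1 calorie/nitrogen ratio quoted above can be reproduced from the per-liter figures (1,500 Cal and 62.5 g protein). The sketch assumes the conventional factor of 6.25 g protein per g nitrogen; the specification does not state which factor was used, and the function name is illustrative.

```python
NITROGEN_FACTOR = 6.25  # assumed g of protein per g of nitrogen (conventional value)


def calorie_to_nitrogen_ratio(calories, protein_g, factor=NITROGEN_FACTOR):
    """Return total Calories per gram of nitrogen."""
    nitrogen_g = protein_g / factor
    return calories / nitrogen_g


# Per-liter values from the text: 1,500 Cal and 62.5 g protein.
ratio = calorie_to_nitrogen_ratio(calories=1500, protein_g=62.5)
print(f"{ratio:.0f}:1")
```

With these inputs, 62.5 g of protein corresponds to 10 g of nitrogen, giving the stated ratio of 150:1.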



All publications and patent applications mentioned in this specification
are indicative of the level of skill of those skilled in the art to which this
invention pertains. All publications and patent applications are herein
incorporated by reference to the same extent as if each individual publication or
patent application was specifically and individually indicated to be incorporated
by reference.

The invention now being fully described, it will be apparent to one of
ordinary skill in the art that many changes and modifications can be made
thereto without departing from the spirit or scope of the appended claims.

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: KNUTZON, DEBORAH 
MUKERJI, PRADIP
HUANG, YUNG-SHENG
THURMOND, JENNIFER
CHAUDHARY, SUNITA 
LEONARD, AMANDA 

(ii) TITLE OF INVENTION: METHODS AND COMPOSITIONS FOR SYNTHESIS
OF LONG CHAIN POLYUNSATURATED FATTY ACIDS

(iii) NUMBER OF SEQUENCES: 40

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: LIMBACH AND LIMBACH LLP 

(B) STREET: 2001 FERRY BUILDING 

(C) CITY: SAN FRANCISCO 

(D) STATE: CA 

(E) COUNTRY: USA 

(F) ZIP: 94111 

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: Microsoft Word 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE:
(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: WARD, MICHAEL R. 

(B) REGISTRATION NUMBER: 38,651 

(C) REFERENCE/DOCKET NUMBER: CGAB-210 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (415) 433-4150 

(B) TELEFAX: (415) 433-8716 

(C) TELEX: N/A 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1617 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

CGACACTCCT TCCTTCTTCT CACCCGTCCT AGTCCCCTTC AACCCCCCTC TTTGACAAAG 60

ACAACAAACC ATGGCTGCTG CTCCCAGTGT GAGGACGTTT ACTCGGGCCG AGGTTTTGAA 120

TGCCGAGGCT CTGAATGAGG GCAAGAAGGA TGCCGAGGCA CCCTTCTTGA TGATCATCGA 180

CAACAAGGTG TACGATGTCC GCGAGTTCGT CCCTGATCAT CCCGGTGGAA GTGTGATTCT 240

CACGCACGTT GGCAAGGACG GCACTGACGT CTTTGACACT TTTCACCCCG AGGCTGCTTG 300

GGAGACTCTT GCCAACTTTT ACGTTGGTGA TATTGACGAG AGCGACCGCG ATATCAAGAA 360

TGATGACTTT GCGGCCGAGG TCCGCAAGCT GCGTACCTTG TTCCAGTCTC TTGGTTACTA 420

CGATTCTTCC AAGGCATACT ACGCCTTCAA GGTCTCGTTC AACCTCTGCA TCTGGGGTTT 480

GTCGACGGTC ATTGTGGCCA AGTGGGGCCA GACCTCGACC CTCGCCAACG TGCTCTCGGC 540

TGCGCTTTTG GGTCTGTTCT GGCAGCAGTG CGGATGGTTG GCTCACGACT TTTTGCATCA 600

CCAGGTCTTC CAGGACCGTT TCTGGGGTGA TCTTTTCGGC GCCTTCTTGG GAGGTGTCTG 660

CCAGGGCTTC TCGTCCTCGT GGTGGAAGGA CAAGCACAAC ACTCACCACG CCGCCCCCAA 720

CGTCCACGGC GAGGATCCCG ACATTGACAC CCACCCTCTG TTGACCTGGA GTGAGCATGC 780

GTTGGAGATG TTCTCGGATG TCCCAGATGA GGAGCTGACC CGCATGTGGT CGCGTTTCAT 840

GGTCCTGAAC CAGACCTGGT TTTACTTCCC CATTCTCTCG TTTGCCCGTC TCTCCTGGTG 900

CCTCCAGTCC ATTCTCTTTG TGCTGCCTAA CGGTCAGGCC CACAAGCCCT CGGGCGCGCG 960

TGTGCCCATC TCGTTGGTCG AGCAGCTGTC GCTTGCGATG CACTGGACCT GGTACCTCGC 1020

CACCATGTTC CTGTTCATCA AGGATCCCGT CAACATGCTG GTGTACTTTT TGGTGTCGCA 1080

GGCGGTGTGC GGAAACTTGT TGGCGATCGT GTTCTCGCTC AACCACAACG GTATGCCTGT 1140

GATCTCGAAG GAGGAGGCGG TCGATATGGA TTTCTTCACG AAGCAGATCA TCACGGGTCG 1200

TGATGTCCAC CCGGGTCTAT TTGCCAACTG GTTCACGGGT GGATTGAACT ATCAGATCGA 1260

GCACCACTTG TTCCCTTCGA TGCCTCGCCA CAACTTTTCA AAGATCCAGC CTGCTGTCGA 1320

GACCCTGTGC AAAAAGTACA ATGTCCGATA CCACACCACC GGTATGATCG AGGGAACTGC 1380

AGAGGTCTTT AGCCGTCTGA ACGAGGTCTC CAAGGCTGCC TCCAAGATGG GTAAGGCGCA 1440

GTAAAAAAAA AAACAAGGAC GTTTTTTTTC GCCAGTGCCT GTGCCTGTGC CTGCTTCCCT 1500

TGTCAAGTCG AGCGTTTCTG GAAAGGATCG TTCAGTGCAG TATCATCATT CTCCTTTTAC 1560

CCCCCGCTCA TATCTCATTC ATTTCTCTTA TTAAACAACT TGTTCCCCCC TTCACCG 1617


(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 457 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Ala Ala Ala Pro Ser Val Arg Thr Phe Thr Arg Ala Glu Val Leu
1 5 10 15

Asn Ala Glu Ala Leu Asn Glu Gly Lys Lys Asp Ala Glu Ala Pro Phe
20 25 30

Leu Met Ile Ile Asp Asn Lys Val Tyr Asp Val Arg Glu Phe Val Pro
35 40 45

Asp His Pro Gly Gly Ser Val Ile Leu Thr His Val Gly Lys Asp Gly
50 55 60

Thr Asp Val Phe Asp Thr Phe His Pro Glu Ala Ala Trp Glu Thr Leu
65 70 75 80

Ala Asn Phe Tyr Val Gly Asp Ile Asp Glu Ser Asp Arg Asp Ile Lys
85 90 95

Asn Asp Asp Phe Ala Ala Glu Val Arg Lys Leu Arg Thr Leu Phe Gln
100 105 110

Ser Leu Gly Tyr Tyr Asp Ser Ser Lys Ala Tyr Tyr Ala Phe Lys Val
115 120 125

Ser Phe Asn Leu Cys Ile Trp Gly Leu Ser Thr Val Ile Val Ala Lys
130 135 140

Trp Gly Gln Thr Ser Thr Leu Ala Asn Val Leu Ser Ala Ala Leu Leu
145 150 155 160

Gly Leu Phe Trp Gln Gln Cys Gly Trp Leu Ala His Asp Phe Leu His
165 170 175

His Gln Val Phe Gln Asp Arg Phe Trp Gly Asp Leu Phe Gly Ala Phe
180 185 190

Leu Gly Gly Val Cys Gln Gly Phe Ser Ser Ser Trp Trp Lys Asp Lys
195 200 205

His Asn Thr His His Ala Ala Pro Asn Val His Gly Glu Asp Pro Asp
210 215 220

Ile Asp Thr His Pro Leu Leu Thr Trp Ser Glu His Ala Leu Glu Met
225 230 235 240

Phe Ser Asp Val Pro Asp Glu Glu Leu Thr Arg Met Trp Ser Arg Phe
245 250 255

Met Val Leu Asn Gln Thr Trp Phe Tyr Phe Pro Ile Leu Ser Phe Ala
260 265 270

Arg Leu Ser Trp Cys Leu Gln Ser Ile Leu Phe Val Leu Pro Asn Gly
275 280 285

Gln Ala His Lys Pro Ser Gly Ala Arg Val Pro Ile Ser Leu Val Glu
290 295 300

Gln Leu Ser Leu Ala Met His Trp Thr Trp Tyr Leu Ala Thr Met Phe
305 310 315 320

Leu Phe Ile Lys Asp Pro Val Asn Met Leu Val Tyr Phe Leu Val Ser
325 330 335

Gln Ala Val Cys Gly Asn Leu Leu Ala Ile Val Phe Ser Leu Asn His
340 345 350

Asn Gly Met Pro Val Ile Ser Lys Glu Glu Ala Val Asp Met Asp Phe
355 360 365

Phe Thr Lys Gln Ile Ile Thr Gly Arg Asp Val His Pro Gly Leu Phe
370 375 380

Ala Asn Trp Phe Thr Gly Gly Leu Asn Tyr Gln Ile Glu His His Leu
385 390 395 400

Phe Pro Ser Met Pro Arg His Asn Phe Ser Lys Ile Gln Pro Ala Val
405 410 415

Glu Thr Leu Cys Lys Lys Tyr Asn Val Arg Tyr His Thr Thr Gly Met
420 425 430

Ile Glu Gly Thr Ala Glu Val Phe Ser Arg Leu Asn Glu Val Ser Lys
435 440 445

Ala Ala Ser Lys Met Gly Lys Ala Gln
450 455

(2) INFORMATION FOR SEQ ID NO: 3: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1488 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

GTCCCCTGTC GCTGTCGGCA CACCCCATCC TCCCTCGCTC CCTCTGCGTT TGTCCTTGGC 60

CCACCGTCTC TCCTCCACCC TCCGAGACGA CTGCAACTGT AATCAGGAAC CGACAAATAC 120

ACGATTTCTT TTTACTCAGC ACCAACTCAA AATCCTCAAC CGCAACCCTT TTTCAGGATG 180

GCACCTCCCA ACACTATCGA TGCCGGTTTG ACCCAGCGTC ATATCAGCAC CTCGGCCCCA 240

AACTCGGCCA AGCCTGCCTT CGAGCGCAAC TACCAGCTCC CCGAGTTCAC CATCAAGGAG 300

ATCCGAGAGT GCATCCCTGC CCACTGCTTT GAGCGCTCCG GTCTCCGTGG TCTCTGCCAC 360

GTTGCCATCG ATCTGACTTG GGCGTCGCTC TTGTTCCTGG CTGCGACCCA GATCGACAAG 420

TTTGAGAATC CCTTGATCCG CTATTTGGCC TGGCCTGTTT ACTGGATCAT GCAGGGTATT 480

GTCTGCACCG GTGTCTGGGT GCTGGCTCAC GAGTGTGGTC ATCAGTCCTT CTCGACCTCC 540

AAGACCCTCA ACAACACAGT TGGTTGGATC TTGCACTCGA TGCTCTTGGT CCCCTACCAC 600

TCCTGGAGAA TCTCGCACTC GAAGCACCAC AAGGCCACTG GCCATATGAC CAAGGACCAG 660

GTCTTTGTGC CCAAGACCCG CTCCCAGGTT GGCTTGCCTC CCAAGGAGAA CGCTGCTGCT 720

GCCGTTCAGG AGGAGGACAT GTCCGTGCAC CTGGATGAGG AGGCTCCCAT TGTGACTTTG 780

TTCTGGATGG TGATCCAGTT CTTGTTCGGA TGGCCCGCGT ACCTGATTAT GAACGCCTCT 840

GGCCAAGACT ACGGCCGCTG GACCTCGCAC TTCCACACGT ACTCGCCCAT CTTTGAGCCC 900

CGCAACTTTT TCGACATTAT TATCTCGGAC CTCGGTGTGT TGGCTGCCCT CGGTGCCCTG 960

ATCTATGCCT CCATGCAGTT GTCGCTCTTG ACCGTCACCA AGTACTATAT TGTCCCCTAC 1020

CTCTTTGTCA ACTTTTGGTT GGTCCTGATC ACCTTCTTGC AGCACACCGA TCCCAAGCTG 1080

CCCCATTACC GCGAGGGTGC CTGGAATTTC CAGCGTGGAG CTCTTTGCAC CGTTGACCGC 1140

TCGTTTGGCA AGTTCTTGGA CCATATGTTC CACGGCATTG TCCACACCCA TGTGGCCCAT 1200

CACTTGTTCT CGCAAATGCC GTTCTACCAT GCTGAGGAAG CTACCTATCA TCTCAAGAAA 1260

CTGCTGGGAG AGTACTATGT GTACGACCCA TCCCCGATCG TCGTTGCGGT CTGGAGGTCG 1320

TTCCGTGAGT GCCGATTCGT GGAGGATCAG GGAGACGTGG TCTTTTTCAA GAAGTAAAAA 1380

AAAAGACAAT GGACCACACA CAACCTTGTC TCTACAGACC TACGTATCAT GTAGCCATAC 1440

CACTTCATAA AAGAACATGA GCTCTAGAGG CGTGTCATTC GCGCCTCC 1488



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 399 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Ala Pro Pro Asn Thr Ile Asp Ala Gly Leu Thr Gln Arg His Ile
1 5 10 15

Ser Thr Ser Ala Pro Asn Ser Ala Lys Pro Ala Phe Glu Arg Asn Tyr
20 25 30

Gln Leu Pro Glu Phe Thr Ile Lys Glu Ile Arg Glu Cys Ile Pro Ala
35 40 45

His Cys Phe Glu Arg Ser Gly Leu Arg Gly Leu Cys His Val Ala Ile
50 55 60

Asp Leu Thr Trp Ala Ser Leu Leu Phe Leu Ala Ala Thr Gln Ile Asp
65 70 75 80

Lys Phe Glu Asn Pro Leu Ile Arg Tyr Leu Ala Trp Pro Val Tyr Trp
85 90 95

Ile Met Gln Gly Ile Val Cys Thr Gly Val Trp Val Leu Ala His Glu
100 105 110

Cys Gly His Gln Ser Phe Ser Thr Ser Lys Thr Leu Asn Asn Thr Val
115 120 125

Gly Trp Ile Leu His Ser Met Leu Leu Val Pro Tyr His Ser Trp Arg
130 135 140

Ile Ser His Ser Lys His His Lys Ala Thr Gly His Met Thr Lys Asp
145 150 155 160

Gln Val Phe Val Pro Lys Thr Arg Ser Gln Val Gly Leu Pro Pro Lys
165 170 175

Glu Asn Ala Ala Ala Ala Val Gln Glu Glu Asp Met Ser Val His Leu
180 185 190

Asp Glu Glu Ala Pro Ile Val Thr Leu Phe Trp Met Val Ile Gln Phe
195 200 205

Leu Phe Gly Trp Pro Ala Tyr Leu Ile Met Asn Ala Ser Gly Gln Asp
210 215 220

Tyr Gly Arg Trp Thr Ser His Phe His Thr Tyr Ser Pro Ile Phe Glu
225 230 235 240

Pro Arg Asn Phe Phe Asp Ile Ile Ile Ser Asp Leu Gly Val Leu Ala
245 250 255

Ala Leu Gly Ala Leu Ile Tyr Ala Ser Met Gln Leu Ser Leu Leu Thr
260 265 270

Val Thr Lys Tyr Tyr Ile Val Pro Tyr Leu Phe Val Asn Phe Trp Leu
275 280 285

Val Leu Ile Thr Phe Leu Gln His Thr Asp Pro Lys Leu Pro His Tyr
290 295 300

Arg Glu Gly Ala Trp Asn Phe Gln Arg Gly Ala Leu Cys Thr Val Asp
305 310 315 320

Arg Ser Phe Gly Lys Phe Leu Asp His Met Phe His Gly Ile Val His
325 330 335

Thr His Val Ala His His Leu Phe Ser Gln Met Pro Phe Tyr His Ala
340 345 350

Glu Glu Ala Thr Tyr His Leu Lys Lys Leu Leu Gly Glu Tyr Tyr Val
355 360 365

Tyr Asp Pro Ser Pro Ile Val Val Ala Val Trp Arg Ser Phe Arg Glu
370 375 380

Cys Arg Phe Val Glu Asp Gln Gly Asp Val Val Phe Phe Lys Lys
385 390 395

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 355 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Glu Val Arg Lys Leu Arg Thr Leu Phe Gln Ser Leu Gly Tyr Tyr Asp
1 5 10 15

Ser Ser Lys Ala Tyr Tyr Ala Phe Lys Val Ser Phe Asn Leu Cys Ile
20 25 30

Trp Gly Leu Ser Thr Val Ile Val Ala Lys Trp Gly Gln Thr Ser Thr
35 40 45

Leu Ala Asn Val Leu Ser Ala Ala Leu Leu Gly Leu Phe Trp Gln Gln
50 55 60

Cys Gly Trp Leu Ala His Asp Phe Leu His His Gln Val Phe Gln Asp
65 70 75 80

Arg Phe Trp Gly Asp Leu Phe Gly Ala Phe Leu Gly Gly Val Cys Gln
85 90 95

Gly Phe Ser Ser Ser Trp Trp Lys Asp Lys His Asn Thr His His Ala
100 105 110

Ala Pro Asn Val His Gly Glu Asp Pro Asp Ile Asp Thr His Pro Leu
115 120 125

Leu Thr Trp Ser Glu His Ala Leu Glu Met Phe Ser Asp Val Pro Asp
130 135 140

Glu Glu Leu Thr Arg Met Trp Ser Arg Phe Met Val Leu Asn Gln Thr
145 150 155 160

Trp Phe Tyr Phe Pro Ile Leu Ser Phe Ala Arg Leu Ser Trp Cys Leu
165 170 175

Gln Ser Ile Leu Phe Val Leu Pro Asn Gly Gln Ala His Lys Pro Ser
180 185 190

Gly Ala Arg Val Pro Ile Ser Leu Val Glu Gln Leu Ser Leu Ala Met
195 200 205

His Trp Thr Trp Tyr Leu Ala Thr Met Phe Leu Phe Ile Lys Asp Pro
210 215 220

Val Asn Met Leu Val Tyr Phe Leu Val Ser Gln Ala Val Cys Gly Asn
225 230 235 240

Leu Leu Ala Ile Val Phe Ser Leu Asn His Asn Gly Met Pro Val Ile
245 250 255

Ser Lys Glu Glu Ala Val Asp Met Asp Phe Phe Thr Lys Gln Ile Ile
260 265 270

Thr Gly Arg Asp Val His Pro Gly Leu Phe Ala Asn Trp Phe Thr Gly
275 280 285

Gly Leu Asn Tyr Gln Ile Glu His His Leu Phe Pro Ser Met Pro Arg
290 295 300

His Asn Phe Ser Lys Ile Gln Pro Ala Val Glu Thr Leu Cys Lys Lys
305 310 315 320

Tyr Asn Val Arg Tyr His Thr Thr Gly Met Ile Glu Gly Thr Ala Glu
325 330 335

Val Phe Ser Arg Leu Asn Glu Val Ser Lys Ala Ala Ser Lys Met Gly
340 345 350

Lys Ala Gln
355

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 104 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Val Thr Leu Tyr Thr Leu Ala Phe Val Ala Ala Asn Ser Leu Gly Val
1 5 10 15

Leu Tyr Gly Val Leu Ala Cys Pro Ser Val Xaa Pro His Gln Ile Ala
20 25 30

Ala Gly Leu Leu Gly Leu Leu Trp Ile Gln Ser Ala Tyr Ile Gly Xaa
35 40 45

Asp Ser Gly His Tyr Val Ile Met Ser Asn Lys Ser Asn Asn Xaa Phe
50 55 60

Ala Gln Leu Leu Ser Gly Asn Cys Leu Thr Gly Ile Ile Ala Trp Trp
65 70 75 80

Lys Trp Thr His Asn Ala His His Leu Ala Cys Asn Ser Leu Asp Tyr
85 90 95

Gly Pro Asn Leu Gln His Ile Pro
100

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 252 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Gly Val Leu Tyr Gly Val Leu Ala Cys Thr Ser Val Phe Ala His Gln
1 5 10 15

Ile Ala Ala Ala Leu Leu Gly Leu Leu Trp Ile Gln Ser Ala Tyr Ile
20 25 30

Gly His Asp Ser Gly His Tyr Val Ile Met Ser Asn Lys Ser Tyr Asn
35 40 45

Arg Phe Ala Gln Leu Leu Ser Gly Asn Cys Leu Thr Gly Ile Ser Ile
50 55 60

Ala Trp Trp Lys Trp Thr His Asn Ala His His Leu Ala Cys Asn Ser
65 70 75 80

Leu Asp Tyr Asp Pro Asp Leu Gln His Ile Pro Val Phe Ala Val Ser
85 90 95

Thr Lys Phe Phe Ser Ser Leu Thr Ser Arg Phe Tyr Asp Arg Lys Leu
100 105 110

Thr Phe Gly Pro Val Ala Arg Phe Leu Val Ser Tyr Gln His Phe Thr
115 120 125

Tyr Tyr Pro Val Asn Cys Phe Gly Arg Ile Asn Leu Phe Ile Gln Thr
130 135 140

Phe Leu Leu Leu Phe Ser Lys Arg Glu Val Pro Asp Arg Ala Leu Asn
145 150 155 160

Phe Ala Gly Ile Leu Val Phe Trp Thr Trp Phe Pro Leu Leu Val Ser
165 170 175

Cys Leu Pro Asn Trp Pro Glu Arg Phe Phe Phe Val Phe Thr Ser Phe
180 185 190

Thr Val Thr Ala Leu Gln His Ile Gln Phe Thr Leu Asn His Phe Ala
195 200 205

Ala Asp Val Tyr Val Gly Pro Pro Thr Gly Ser Asp Trp Phe Glu Lys
210 215 220

Gln Ala Ala Gly Thr Ile Asp Ile Ser Cys Arg Ser Tyr Met Asp Trp
225 230 235 240

Phe Phe Gly Gly Leu Gln Phe Gln Leu Glu His His
245 250

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 125 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Gly Xaa Xaa Asn Phe Ala Gly Ile Leu Val Phe Trp Thr Trp Phe Pro
1 5 10 15

Leu Leu Val Ser Cys Leu Pro Asn Trp Pro Glu Arg Phe Xaa Phe Val
20 25 30

Phe Thr Gly Phe Thr Val Thr Ala Leu Gln His Ile Gln Phe Thr Leu
35 40 45

Asn His Phe Ala Ala Asp Val Tyr Val Gly Pro Pro Thr Gly Ser Asp
50 55 60

Trp Phe Glu Lys Gln Ala Ala Gly Thr Ile Asp Ile Ser Cys Arg Ser
65 70 75 80

Tyr Met Asp Trp Phe Phe Cys Gly Leu Gln Phe Gln Leu Glu His His
85 90 95

Leu Phe Pro Arg Leu Pro Arg Cys His Leu Arg Lys Val Ser Pro Val
100 105 110

Gly Gln Arg Gly Phe Gln Arg Lys Xaa Asn Leu Ser Xaa
115 120 125

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 131 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Pro Ala Thr Glu Val Gly Gly Leu Ala Trp Met Ile Thr Phe Tyr Val
1 5 10 15

Arg Phe Phe Leu Thr Tyr Val Pro Leu Leu Gly Leu Lys Ala Phe Leu
20 25 30

Gly Leu Phe Phe Ile Val Arg Phe Leu Glu Ser Asn Trp Phe Val Trp
35 40 45

Val Thr Gln Met Asn His Ile Pro Met His Ile Asp His Asp Arg Asn
50 55 60

Met Asp Trp Val Ser Thr Gln Leu Gln Ala Thr Cys Asn Val His Lys
65 70 75 80

Ser Ala Phe Asn Asp Trp Phe Ser Gly His Leu Asn Phe Gln Ile Glu
85 90 95

His His Leu Phe Pro Thr Met Pro Arg His Asn Tyr His Xaa Val Ala
100 105 110

Pro Leu Val Gln Ser Leu Cys Ala Lys His Gly Ile Glu Tyr Gln Ser
115 120 125

Lys Pro Leu
130



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 87 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Cys Ser Pro Lys Ser Ser Pro Thr Arg Asn Met Thr Pro Ser Pro Phe
1 5 10 15

Ile Asp Trp Leu Trp Gly Gly Leu Asn Tyr Gln Ile Glu His His Leu
20 25 30

Phe Pro Thr Met Pro Arg Cys Asn Leu Asn Arg Cys Met Lys Tyr Val
35 40 45

Lys Glu Trp Cys Ala Glu Asn Asn Leu Pro Tyr Leu Val Asp Asp Tyr
50 55 60

Phe Val Gly Tyr Asn Leu Asn Leu Gln Gln Leu Lys Asn Met Ala Glu
65 70 75 80

Leu Val Gln Ala Lys Ala Ala
85

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 143 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Arg His Glu Ala Ala Arg Gly Gly Thr Arg Leu Ala Tyr Met Leu Val
1 5 10 15

Cys Met Gln Trp Thr Asp Leu Leu Trp Ala Ala Ser Phe Tyr Ser Arg
20 25 30

Phe Phe Leu Ser Tyr Ser Pro Phe Tyr Gly Ala Thr Gly Thr Leu Leu
35 40 45

Leu Phe Val Ala Val Arg Val Leu Glu Ser His Trp Phe Val Trp Ile
50 55 60

Thr Gln Met Asn His Ile Pro Lys Glu Ile Gly His Glu Lys His Arg
65 70 75 80

Asp Trp Ala Ser Ser Gln Leu Ala Ala Thr Cys Asn Val Glu Pro Ser
85 90 95

Leu Phe Ile Asp Trp Phe Ser Gly His Leu Asn Phe Gln Ile Glu His
100 105 110

His Leu Phe Pro Thr Met Thr Arg His Asn Tyr Arg Xaa Val Ala Pro
115 120 125

Leu Val Lys Ala Phe Cys Ala Lys His Gly Leu His Tyr Glu Val
130 135 140

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

CCAAGCTTCT GCAGGAGCTC TTTTTTTTTT TTTTT 35
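
Each sequence row in the listing ends with a running residue count, so the declared LENGTH can be cross-checked mechanically. The following is a minimal sketch using the SEQ ID NO: 12 row; `sequence_length` is an illustrative helper, not part of any sequence-listing tool.

```python
import re


def sequence_length(listing_text):
    """Count nucleotide letters, ignoring spaces and position numbers."""
    return len(re.findall(r"[ACGTUNacgtun]", listing_text))


# Row for SEQ ID NO: 12 as it appears in the listing (declared length 35).
primer_row = "CCAAGCTTCT GCAGGAGCTC TTTTTTTTTT TTTTT 35"
declared = int(primer_row.split()[-1])
counted = sequence_length(primer_row)
print("length matches" if counted == declared else "length mismatch")
```

The same check can be run over the longer sequences by joining their rows, which is a convenient way to spot dropped or duplicated residues in a transcribed listing.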

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

CUACUACUAC UAGGAGTCCT CTACGGTGTT TTG 33

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

CAUCAUCAUC AUATGATGCT CAAGCTGAAA CTG 33

(2) INFORMATION FOR SEQ ID NO: 15:



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 39 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

TACCAACTCG AGAAAATGGC TGCTGCTCCC AGTGTGAGG 39
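
The relationship between the SEQ ID NO: 15 primer and the protein of SEQ ID NO: 2 can be illustrated by translating the primer from its first ATG. This is a sketch under the assumption that the standard genetic code applies; `CODON_TABLE` and `translate_from_first_atg` are names introduced here for illustration.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_from_first_atg(dna):
    """Translate complete codons starting at the first ATG (one-letter codes)."""
    dna = dna.replace(" ", "").upper()
    start = dna.find("ATG")
    if start < 0:
        return ""
    codons = (dna[i:i + 3] for i in range(start, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


primer = "TACCAACTCG AGAAAATGGC TGCTGCTCCC AGTGTGAGG"
peptide = translate_from_first_atg(primer)
print(peptide)
```

The translated stretch corresponds to the first eight residues listed for SEQ ID NO: 2 (Met Ala Ala Ala Pro Ser Val Arg), consistent with the primer spanning the start of that coding region.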
(2) INFORMATION FOR SEQ ID NO: 16: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 39 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

AACTGATCTA GATTACTGCG CCTTACCCAT CTTGGAGGC 39

(2) INFORMATION FOR SEQ ID NO: 17:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

TACCAACTCG AGAAAATGGC ACCTCCCAAC ACTATCGAT 39

(2) INFORMATION FOR SEQ ID NO: 18:



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 39 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

AACTGATCTA GATTACTTCT TGAAAAAGAC CACGTCTCC 39

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 746 nucleic acids

(B) TYPE: nucleic acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

CGTATGTCAC TCCATTCCAA ACTCGTTCAT GGTATCATAA ATATCUUVCAC ATTTACGCTC 60

CACTCCTCTA TGGTATTTAC AC3VCTCAAAT ATCGTACTCA AGATTGGGAA GCTTTTGTAA 120

AGGATGGTAA AAATGGTGCA ATTCGTGTTA GTGTCGCCAC AAATTTCGAT AAGGCCGCTT 180

ACGTCATTGG TAAATTGTCT TTTGTTTTCT TCCGTTTCAT CCTTCCACTC CGTTATCATA 240

GCTTTACAGA TTTAATTTGT TATTTCCTCA TTGCTGAATT CGTCTTTGGT TGGTATCTCA 300

CAATTAATTT CCAAGTTAGT CATGTCGCTG AACSATCTTCAA ATTCTTTGCT ACCCCTGAAA 360

GACCAGATGA ACCATCTCAA ATCAATGAAG ATTGGGCAAT CCTTCAACTT AAAACTACTC 420

AAGATTATGG TCATGGTTCA CTCCTTTGTA CCTTTTTTAG TGGTTCTTTA AATCATCAAG 480

TTGTTCATCA TTTATTCCCA TCAATTGCTC AAGATTTCTA CCCACAACTT GTACCAATTG 540

TAAAAGAAGT TTGTAAAGAA CATAACATTA CTTACCACAT TAAACCAAAC TTCACTGAAG 600

CTATTATGTC ACACATTAAT TACCTTTACA AAATGGGTAA TGATCCAGAT TATGTTAAAA 660

AACCATTAGC CTCAAAAGAT GATTAAATGA AATAACTTAA AAACCAATTA TTTACTTTTG 720

ACAAACAGTA ATATTAATAA ATACAA 746

(2) INFORMATION FOR SEQ ID NO: 20: 

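(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 227 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Tyr Val Thr Pro Phe Gln Thr Arg Ser Trp Tyr His Lys Tyr
1 5 10

His Ile Tyr Ala Pro Leu Leu Tyr Gly Ile Tyr Thr Leu Tyr
20 25 30

Arg Thr Gln Asp Trp Glu Ala Phe Val Lys Asp Gly Lys Asn Gly
35 40 45

Ala Ile Arg Val Ser Val Ala Thr Asn Phe Asp Lys Ala Ala Tyr
50 55 60

Val Ile Gly Lys Leu Ser Phe Val Phe Phe Arg Phe Ile Leu Pro
65 70 75

Leu Arg Tyr His Ser Phe Thr Asp Leu Ile Cys Tyr Phe Leu Ile
80 85 90

Ala Glu Phe Val Phe Gly Trp Tyr Leu Thr Ile Asn Phe Gln Val
95 100 105

Ser His Val Ala Glu Asp Leu Lys Phe Phe Ala Thr Pro Glu Arg
110 115 120

Pro Asp Glu Pro Ser Gln Ile Asn Glu Asp Trp Ala Ile Leu Gln
125 130 135

Leu Lys Thr Thr Gln Asp Tyr Gly His Gly Ser Leu Leu Cys Thr
140 145 150

Phe Phe Ser Gly Ser Leu Asn His Gln Val Val His His Leu Phe
155 160 165

Pro Ser Ile Ala Gln Asp Phe Tyr Pro Gln Leu Val Pro Ile Val
170 175 180

Lys Glu Val Cys Lys Glu His Asn Ile Thr Tyr His Ile Lys Pro
185 190 195

Asn Phe Thr Glu Ala Ile Met Ser His Ile Asn Tyr Leu Tyr Lys
200 205 210

Met Gly Asn Asp Pro Asp Tyr Val Lys Lys Pro Leu Ala Ser Lys
215 220 225

Asp Asp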






20 










25 










30 


Arg 


Thr 


Gin Asp 


Trp 


Glu 


Ala 


Phe 


Val 


Lys Asp Gly Lys 


Asn 


Gly 








35 










40 










45 


Ala 


He 


Arg 


Val 


Ser 


Val 


Ala 


Thr 


Asn 


Phe 


Asp 


Lys 


Ala 


Ala 


Tyr 








50 










55 










60 


Val 


He 


Gly 


Lys 


Leu 


Ser 


Phe 


Val 


Phe 


Phe 


Arg 


Phe 


He 


Leu 


Pro 








65 










70 










75 


Leu 


Arg 


Tyr 


His 


Ser 


Phe 


Thr Asp 


Leu 


He 


Cys 


Tyr 


Phe 


Leu 


He 








80 










85 










90 


Ala 


Glu 


Phe 


Val 


Phe 


Gly 


Trp 


Tyr 


Leu 


Thr 


He 


Asn 


Phe 


Gin 


Val 










95 










100 










105 


Ser 


His 


Val 


Ala 


Glu 


Asp 


Leu 


Lys 


Phe 


Phe 


Ala 


Thr 


Pro 


Glu 


Arg 










110 










115 










120 


Pro 


Asp 


Glu 


Pro 


Ser 


Gin 


He 


Asn 


Glu 


Asp 


Trp 


Ala 


He 


Leu 


Gin 








125 










130 










135 


Leu 


Lys 


Thr 


Thr 


Gin 


Asp 


Tyr 


Gly 


His 


Gly 


Ser 


Leu 


Leu 


Cys 


Thr 








140 










145 










150 


Phe 


Phe 


Ser Gly 


Ser 


Leu 


Asn 


His 


Gin 


Val 


Val 


His 


His 


Leu 


Phe 










155 










160 










165 


Pro 


Ser 


He 


Ala 


Gin 


Asp 


Phe 


Tyr 


Pro 


Gin 


Leu 


Val 


Pro 


He 


Val 










170 










175 










180 


Lys 


Glu 


Val 


Cys 


Lys 


Glu 


His 


Asn 


He 


Thr 


Tyr 


His 


He 


Lys 


Pro 








185 










190 










195 


Asn 


Phe 


Thr 


Glu 


Ala 


He 


Met 


Ser 


His 


He 


Asn Tyr 


Leu 


Tyr 


Lys 










200 










205 










210 


Met 


Gly 


Asn 


Asp 


Pro Asp Tyr Val 


Lys 


Lys 


Pro 


Leu 


Ala 


Ser 


Lys 








215 










220 










225 


Asp Asp 





























(2) INFORMATION FOR SEQ ID NO 21: 



(i) SEQUENCE CHT^RACTERISTICS : 

(A) LENGTH: 4 94 nucleic acids 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 
55 (D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: nucleic acid 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 



TTTTGGAAGG NTCCAAGTTN ACCACGGANT NGGCAAGTTN ACGGGGCGGA AANCGGTTTT 60 

CCCCCCAAGC CTTTTGTCGA CTGGTTCTGT GGTGGCTTCC AGTACCAAGT CGACCACCAC 120 

TTATTCCCCA GCCTGCCCCG ACACAATCTG GCCAAGACAC ACGCACTGGT CGAATCGTTC 180 

65 TGCAAGGAGT GGGGTGTCCA GTACCACGAA GCCGACCTCG TGGACGGGAC CATGGAAGTC 240 

TTGCACCATT TGGGCAGCGT GGCCGGCGAA TTCGTCGTGG ATTTTGTACG CGACGGACCC 300 
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GCCATGTAAT CGTCGTTCGT GACGATGCAA GGGTTCACGC ACATCTACAC ACACTCACTC 360 

ACACAACTAG TGTAACTCGT ATAGAATTCG GTGTCGACCT GGACCTTGTT TGACTGGTTG 420 

GGGATAGGGT AGGTAGGCGG ACGCGTGGGT CGNCCCCGGG AATTCTGTGA CCGGTACCTG 480 

GCCCGCGTNA AAGT 494 



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 87 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:



Phe Trp Lys Xxx Pro Ser Xxx Pro Arg Xxx Xxx Gln Val Xxx Gly
1               5                   10                  15

Ala Glu Xxx Gly Phe Pro Pro Lys Pro Phe Val Asp Trp Phe Cys
                20                  25                  30

Gly Gly Phe Gln Tyr Gln Val Asp His His Leu Phe Pro Ser Leu
                35                  40                  45

Pro Arg His Asn Leu Ala Lys Thr His Ala Leu Val Glu Ser Phe
                50                  55                  60

Cys Lys Glu Trp Gly Val Gln Tyr His Glu Ala Asp Leu Val Asp
                65                  70                  75

Gly Thr Met Glu Val Leu His His Leu Gly Ser Val Ala Gly Glu
                65                  70                  75

Phe Val Val Asp Phe Val Arg Asp Gly Pro Ala Met
                80                  85

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 520 nucleic acids 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:



GGATGGAGTT CGTCTGGATC GCTGTGCGCT ACGCGACGTG GTTTAAGCGT CATGGGTGCG 60

CTTGGGTACA CGCCGGGGCA GTCGTTGGGC ATGTACTTGT GCGCCTTTGG TCTCGGCTGC 120

ATTTACATTT TTCTGCAGTT CGCCGTAAGT CACACCCATT TGCCCGTGAG CAACCCGGAG 180

GATCAGCTGC ATTGGCTCGA GTACGCGCGG ACCACACTGT GAACATCAGC ACCAAGTCGT 240

GGTTTGTCAC ATGGTGGATG TCGAACCTCA ACTTTCAGAT CGAGCACCAC CTTTTCCCCA 300

CGGCGCCCCA GTTCCGTTTC AAGGAGATCA GCCCGCGCGT CGAGGCCCTC TTCAAGCGCC 360

ACGGTCTCCC TTACTACGAC ATGCCCTACA CGAGCGCCGT CTCCACCACC TTTGCCAACC 420

TCTACTCCGT CGGCCATTCC GTCGGCGACG CCAAGCGCGA CTAGCCTCTT TTCCTAGACC 480

TTAATTCCCC ACCCCACCCC ATGTTCTGTC TTCCTCCCGC 520



(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 153 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:



Met Glu Phe Val Trp Ile Ala Val Arg Tyr Ala Thr Trp Phe Lys
1               5                   10                  15

Arg His Gly Cys Ala Trp Val His Ala Gly Ala Val Val Gly His
                20                  25                  30

Val Leu Val Arg Leu Trp Ser Arg Leu His Leu His Phe Ser Ala
                35                  40                  45

Val Arg Arg Lys Ser His Pro Phe Ala Arg Glu Gln Pro Gly Gly
                50                  55                  60

Ser Ala Ala Leu Ala Arg Val Arg Ala Asp His Thr Val Asn Ile
                65                  70                  75

Ser Thr Lys Ser Trp Phe Val Thr Trp Trp Met Ser Asn Leu Asn
                80                  85                  90

Phe Gln Ile Glu His His Leu Phe Pro Thr Ala Pro Gln Phe Arg
                95                  100                 105

Phe Lys Glu Ile Ser Pro Arg Val Glu Ala Leu Phe Lys Arg His
                110                 115                 120

Gly Leu Pro Tyr Tyr Asp Met Pro Tyr Thr Ser Ala Val Ser Thr
                125                 130                 135

Thr Phe Ala Asn Leu Tyr Ser Val Gly His Ser Val Gly Asp Ala
                140                 145                 150

Lys Arg Asp



(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 420 nucleic acids

(B) TYPE: nucleic acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:



ACGCGTCCGC CCACGCGTCC GCCGCGAGCA ACTCATCAAG GAAGGCTACT TTGACCCCTC 60 

GCTCCCGCAC ATGACGTACC GCGTGGTCGA GATTGTTGTT CTCTTCGTGC TTTCCTTTTG 120 

GCTGATGGGT CAGTCTTCAC CCCTCGCGCT CGCTCTCGGC ATTGTCGTCA GCGGCATCTC 180

TCAGGGTCGC TGCGGCTGGG TAATGCATGA GATGGGCCAT GGGTCGTTCA CTGGTGTCAT 240

TTGGCTTGAC GACCGGTTGT GCGAGTTCTT TTACGGCGTT GGTTGTGGCA TGAGCGGTCA 300 

TTACTGGAAA AACCAGCACA GCAAACACCA CGCAGCGCCA AACCGGCTCG AGCACGATGT 360 

AGATCTCAAC ACCTTGCCAT TGGTGGCCTT CAACGAGCGC GTCGTGCGCA AGGTCCGACC 420 



(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 125 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:



Arg Val Arg Pro Arg Val Arg Arg Glu Gln Leu Ile Lys Glu Gly
1               5                   10                  15

Tyr Phe Asp Pro Ser Leu Pro His Met Thr Tyr Arg Val Val Glu
                20                  25                  30

Ile Val Val Leu Phe Val Leu Ser Phe Trp Leu Met Gly Gln Ser
                35                  40                  45

Ser Pro Leu Ala Leu Ala Leu Gly Ile Val Val Ser Gly Ile Ser
                50                  55                  60

Gln Gly Arg Cys Gly Trp Val Met His Glu Met Gly His Gly Ser
                65                  70                  75

Phe Thr Gly Val Ile Trp Leu Asp Asp Arg Leu Cys Glu Phe Phe
                65                  70                  75

Tyr Gly Val Gly Cys Gly Met Ser Gly His Tyr Trp Lys Asn Gln
                80                  85                  90

His Ser Lys His His Ala Ala Pro Asn Arg Leu Glu His Asp Val
                95                  100                 105

Asp Leu Asn Thr Leu Pro Leu Val Ala Phe Asn Glu Arg Val Val
                110                 115                 120

Arg Lys Val Arg Pro
                125























(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1219 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid (Edited Contig 2692004)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:



GCACGCCGAC CGGCGCCGGG AGATCCTGGC AAAGTATCCA GAGATAAAGT CCTTGATGAA 60

ACCTGATCCC AATTTGATAT GGATTATAAT TATGATGGTT CTCACCCAGT TGGGTGCATT 120

TTACATAGTA AAAGACTTGG ACTGGAAATG GGTGATATTT GGGGCCTATG CGTTTGGCAG 180

TTGCATTAAC CACTCAATGA CTCTGGCTAT TCATGAGATT GCCCACAATG CTGCCTTTGG 240

CAACTGCAAA GCAATGTGGA ATCGCTGGTT TGGAATGTTT GCTAATCTTC CTATTGGGAT 300

TCCATATTCA ATTTCCTTTA AGAGGTATCA CATGGATCAT CATCGGTACC TTGGAGCTGA 360

TGGCGTCGAT GTAGATATTC CTACCGATTT TGAGGGCTGG TTCTTCTGTA CCGCTTTCAG 420

AAAGTTTATA TGGGTTATTC TTCAGCCTCT CTTTTATGCC TTTCGACCTC TGTTCATCAA 480

CCCCAAACCA ATTACGTATC TGGAAGTTAT CAATACCGTG GCACAGGTCA CTTTTGACAT 540

TTTAATTTAT TACTTTTTGG GAATTAAATC CTTAGTCTAC ATGTTGGCAG CATCTTTACT 600

TGGCCTGGGT TTGCACCCAA TTTCTGGACA TTTTATAGCT GAGCATTACA TGTTCTTAAA 660

GGGTCATGAA ACTTACTCAT ATTATGGGCC TCTGAATTTA CTTACCTTCA ATGTGGGTTA 720

TCATAATGAA CATCATGATT TCCCCAACAT TCCTGGAAAA AGTCTTCCAC TGGTGAGGAA 780

AATAGCAGCT GAATACTATG ACAACCTCCC TCACTACAAT TCCTGGATAA AAGTACTGTA 840

TGATTTTGTG ATGGATGATA CAATAAGTCC CTACTCAAGA ATGAAGAGGC ACCAAAAAGG 900

AGAGATGGTG CTGGAGTAAA TATCATTAGT GCCAAAGGGA TTCTTCTCCA AAACTTTAGA 960

TGATAAAATG GAATTTTTGC ATTATTAAAC TTGAGACCAG TGATGCTCAG AAGCTCCCCT 1020

GGCACAATTT CAGAGTAAGA GCTCGGTGAT ACCAAGAAGT GAATCTGGCT TTTAAACAGT 1080

CAGCCTGACT CTGTACTGCT CAGTTTCACT CACAGGAAAC TTGTGACTTG TGTATTATCG 1140

TCATTGAGGA TGTTTCACTC ATGTCTGTCA TTTTATAAGC ATATCATTTA AAAAGCTTCT 1200

AAAAAGCTAT TTCGCCAGG 1219






(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 655 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid (Edited Contig 2153526)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

TTACCTTCTA CGTCCGCTTC TTCCTCACTT ATGTGCCACT ATTGGGGCTG AAAGCTTCCT 60 

GGGCCTTTTC TTCATAGTCA GGTTCCTGGA AAGCAACTGG TTTGTGTGGG TGACACAGAT 120 

GAACCATATT CCCATGCACA TTGATCATGA CCGGAACATG GACTGGGTTT CCACCCAGCT 180

CCAGGCCACA TGCAATGTCC ACAAGTCTGC CTTCAATGAC TGGTTCAGTG GACACCTCAA 240 

CTTCCAGATT GAGCACCATC TTTTTCCCAC GATGCCTCGA CACAATTACC ACAAAGTGGC 300 

TCCCCTGGTG CAGTCCTTGT GTGCCAAGCA TGGCATAGAG TACCAGTCCA AGCCCCTGCT 360 

GTCAGCCTTC GCCGACATCA TCCACTCACT AAAGGAGTCA GGGCAGCTCT GGCTAGATGC 420 

CTATCTTCAC CAATAACAAC AGCCACCCTG CCCAGTCTGG AAGAAGAGGA GGAAGACTCT 480

GGAGCCAAGG CAGAGGGGAG CTTGAGGGAC AATGCCACTA TAGTTTAATA CTCAGAGGGG 540 

GTTGGGTTTG GGGACATAAA GCCTCTGACT CAAACTCCTC CCTTTTATCT TCTAGCCACA 600 

GTTCTAAGAC CCAAAGTGGG GGGTGGACAC AGAAGTCCCT AGGAGGGAAG GAGCT 655 



(2) INFORMATION FOR SEQ ID NO: 29: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 304 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid (Edited Contig 3506132)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

GTCTTTTACT TTGGCAATGG CTGGATTCCT ACCCTCATCA CGGCCTTTGT CCTTGCTACC 60

TCTCAGGCCC AAGCTGGATG GCTGCAACAT GATTATGGCC ACCTGTCTGT CTACAGAAAA 120

CCCAAGTGGA ACCACCTTGT CCACAAATTC GTCATTGGCC ACTTAAAGGG TGCCTCTGCC 180 

AACTGGTGGA ATCATCGCCA CTTCCAGCAC CACGCCAAGC CTAACATCTT CCACAAGGAT 240

CCCGATGTGA ACATGCTGCA CGTGTTTGTT CTGGGCGAAT GGCAGCCCAT CGAGTACGGC 300

AAGA 304



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 918 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid (Edited Contig 3854933)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:
CAGGGACCTA CCCCGCGCTA CTTCACCTGG GACGAGGTGG CCCAGCGCTC AGGGTGCGAG 60

GAGCGGTGGC TAGTGATCGA CCGTAAGGTG TACAACATCA GCGAGTTCAC CCGCCGGCAT 120

CCAGGGGGCT CCCGGGTCAT CAGCCACTAC GCCGGGCAGG ATGCCACGGA TCCCTTTGTG 180

GCCTTCCACA TCAACAAGGG CCTTGTGAAG AAGTATATGA ACTCTCTCCT GATTGGAGAA 240

CTGTCTCCAG AGCAGCCCAG CTTTGAGCCC ACCAAGAATA AAGAGCTGAC AGATGAGTTC 300

CGGGAGCTGC GGGCCACAGT GGAGCGGATG GGGCTCATGA AGGCCAACCA TGTCTTCTTC 360

CTGCTGTACC TGCTGCACAT CTTGCTGCTG GATGGTGGAG CCTGGCTCAC CCTTTGGGTC 420

TTTGGGACGT CCTTTTTGCC CTTCCTCCTC TGTGCGGTGC TGCTCAGTGC AGTTCAGGCC 480

CAGGCTGGCT GGCTGCAGCA TGACTTTGGG CACCTGTCGG TCTTCAGCAC CTCAAAGTGG 540

AACCATCTGC TACATCATTT TGTGATTGGC CACCTGAAGG GGGCCCCCGC CAGTTGGTGG 600

AACCACATGC ACTTCCAGCA CCATGCCAAG CCCAACTGCT TCCGCAAAGA CCCAGACATC 660

AACATGCATC CCTTCTTCTT TGCCTTGGGG AAGATCCTCT CTGTGGAGCT TGGGAAACAG 720

AAGAAAAAAT ATATGCCGTA CAACCACCAG CACARATACT TCTTCCTAAT TGGGCCCCCA 780

GCCTTGCTGC CTCTCTACTT CCAGTGGTAT ATTTTCTATT TTGTTATCCA GCGAAAGAAG 840

TGGGTGGACT TGGCCTGGAT CAGCAAACAG GAATACGATG AAGCCGGGCT TCCATTGTCC 900

ACCGCAAATG CTTCTAAA 918



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1686 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid (Edited Contig 2511785)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:



GCCACTTAAA GGGTGCCTCT GCCAACTGGT GGAATCATCG CCACTTCCAG CACCACGCCA 60

AGCCTAACAT CTTCCACAAG GATCCCGATG TGAACATGCT GCACGTGTTT GTTCTGGGCG 120

AATGGCAGCC CATCGAGTAC GGCAAGAAGA AGCTGAAATA CCTGCCCTAC AATCACCAGC 180

ACGAATACTT CTTCCTGATT GGGCCGCCGC TGCTCATCCC CATGTATTTC CAGTACCAGA 240

TCATCATGAC CATGATCGTC CATAAGAACT GGGTGGACCT GGCCTGGGCC GTCAGCTACT 300

ACATCCGGTT CTTCATCACC TACATCCCTT TCTACGGCAT CCTGGGAGCC CTCCTTTTCC 360

TCAACTTCAT CAGGTTCCTG GAGAGCCACT GGTTTGTGTG GGTCACACAG ATGAATCACA 420

TCGTCATGGA GATTGACCAG GAGGCCTACC GTGACTGGTT CAGTAGCCAG CTGACAGCCA 480

CCTGCAACGT GGAGCAGTCC TTCTTCAACG ACTGGTTCAG TGGACACCTT AACTTCCAGA 540

TTGAGCACCA CCTCTTCCCC ACCATGCCCC GGCACAACTT ACACAAGATC GCCCCGCTGG 600

TGAAGTCTCT ATGTGCCAAG CATGGCATTG AATACCAGGA GAAGCCGCTA CTGAGGGCCC 660

TGCTGGACAT CATCAGGTCC CTGAAGAAGT CTGGGAAGCT GTGGCTGGAC GCCTACCTTC 720

ACAAATGAAG CCACAGCCCC CGGGACACCG TGGGGAAGGG GTGCAGGTGG GGTGATGGCC 780

AGAGGAATGA TGGGCTTTTG TTCTGAGGGG TGTCCGAGAG GCTGGTGTAT GCACTGCTCA 840

CGGACCCCAT GTTGGATCTT TCTCCCTTTC TCCTCTCCTT TTTCTCTTCA CATCTCCCCC 900

ATAGCACCCT GCCCTCATGG GACCTGCCCT CCCTCAGCCG TCAGCCATCA GCCATGGCCC 960

TCCCAGTGCC TCCTAGCCCC TTCTTCCAAG GAGCAGAGAG GTGGCCACCG GGGGTGGCTC 1020

TGTCCTACCT CCACTCTCTG CCCCTAAAGA TGGGAGGAGA CCAGCGGTCC ATGGGTCTGG 1080

CCTGTGAGTC TCCCCTTGCA GCCTGGTCAC TAGGCATCAC CCCCGCTTTG GTTCTTCAGA 1140

TGCTCTTGGG GTTCATAGGG GCAGGTCCTA GTCGGGCAGG GCCCCTGACC CTCCCGGCCT 1200

GGCTTCACTC TCCCTGACGG CTGCCATTGG TCCACCCTTT CATAGAGAGG CCTGCTTTGT 1260

TACAAAGCTC GGGTCTCCCT CCTGCAGCTC GGTTAAGTAC CCGAGGCCTC TCTTAAGATG 1320

TCCAGGGCCC CAGGCCCGCG GGCACAGCCA GCCCAAACCT TGGGCCCTGG AAGAGTCCTC 1380

CACCCCATCA CTAGAGTGCT CTGACCCTGG GCTTTCACGG GCCCCATTCC ACCGCCTCCC 1440

CAACTTGAGC CTGTGACCTT GGGACCAAAG GGGGAGTCCC TCGTCTCTTG TGACTCAGCA 1500

GAGGCAGTGG CCACGTTCAG GGAGGGGCCG GCTGGCCTGG AGGCTCAGCC CACCCTCCAG 1560

CTTTTCCTCA GGGTGTCCTG AGGTCCAAGA TTCTGGAGCA ATCTGACCCT TCTCCAAAGG 1620

CTCTGTTATC AGCTGGGCAG TGCCAGCCAA TCCCTGGCCA TTTGGCCCCA GGGGACGTGG 1680

GCCCTG 1686



(2) INFORMATION FOR SEQ ID NO: 32: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1843 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid (Contig 2535)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:



GTCTTTTACT TTGGCAATGG CTGGATTCCT ACCCTCATCA CGGCCTTTGT CCTTGCTACC 60

TCTCAGGCCC AAGCTGGATG GCTGCAACAT GATTATGGCC ACCTGTCTGT CTACAGAAAA 120

CCCAAGTGGA ACCACCTTGT CCACAAATTC GTCATTGGCC ACTTAAAGGG TGCCTCTGCC 180

AACTGGTGGA ATCATCGCCA CTTCCAGCAC CACGCCAAGC CTAACATCTT CCACAAGGAT 240

CCCGATGTGA ACATGCTGCA CGTGTTTGTT CTGGGCGAAT GGCAGCCCAT CGAGTACGGC 300

AAGAAGAAGC TGAAATACCT GCCCTACAAT CACCAGCACG AATACTTCTT CCTGATTGGG 360

CCGCCGCTGC TCATCCCCAT GTATTTCCAG TACCAGATCA TCATGACCAT GATCGTCCAT 420

AAGAACTGGG TGGACCTGGC CTGGGCCGTC AGCTACTACA TCCGGTTCTT CATCACCTAC 480

ATCCCTTTCT ACGGCATCCT GGGAGCCCTC CTTTTCCTCA ACTTCATCAG GTTCCTGGAG 540

AGCCACTGGT TTGTGTGGGT CACACAGATG AATCACATCG TCATGGAGAT TGACCAGGAG 600

GCCTACCGTG ACTGGTTCAG TAGCCAGCTG ACAGCCACCT GCAACGTGGA GCAGTCCTTC 660

TTCAACGACT GGTTCAGTGG ACACCTTAAC TTCCAGATTG AGCACCACCT CTTCCCCACC 720

ATGCCCCGGC ACAACTTACA CAAGATCGCC CCGCTGGTGA AGTCTCTATG TGCCAAGCAT 780

GGCATTGAAT ACCAGGAGAA GCCGCTACTG AGGGCCCTGC TGGACATCAT CAGGTCCCTG 840

AAGAAGTCTG GGAAGCTGTG GCTGGACGCC TACCTTCACA AATGAAGCCA CAGCCCCCGG 900

GACACCGTGG GGAAGGGGTG CAGGTGGGGT GATGGCCAGA GGAATGATGG GCTTTTGTTC 960

TGAGGGGTGT CCGAGAGGCT GGTGTATGCA CTGCTCACGG ACCCCATGTT GGATCTTTCT 1020

CCCTTTCTCC TCTCCTTTTT CTCTTCACAT CTCCCCCATA GCACCCTGCC CTCATGGGAC 1080

CTGCCCTCCC TCAGCCGTCA GCCATCAGCC ATGGCCCTCC CAGTGCCTCC TAGCCCCTTC 1140

TTCCAAGGAG CAGAGAGGTG GCCACCGGGG GTGGCTCTGT CCTACCTCCA CTCTCTGCCC 1200

CTAAAGATGG GAGGAGACCA GCGGTCCATG GGTCTGGCCT GTGAGTCTCC CCTTGCAGCC 1260

TGGTCACTAG GCATCACCCC CGCTTTGGTT CTTCAGATGC TCTTGGGGTT CATAGGGGCA 1320

GGTCCTAGTC GGGCAGGGCC CCTGACCCTC CCGGCCTGGC TTCACTCTCC CTGACGGCTG 1380

CCATTGGTCC ACCCTTTCAT AGAGAGGCCT GCTTTGTTAC AAAGCTCGGG TCTCCCTCCT 1440

GCAGCTCGGT TAAGTACCCG AGGCCTCTCT TAAGATGTCC AGGGCCCCAG GCCCGCGGGC 1500

ACAGCCAGCC CAAACCTTGG GCCCTGGAAG AGTCCTCCAC CCCATCACTA GAGTGCTCTG 1560

ACCCTGGGCT TTCACGGGCC CCATTCCACC GCCTCCCCAA CTTGAGCCTG TGACCTTGGG 1620

ACCAAAGGGG GAGTCCCTCG TCTCTTGTGA CTCAGCAGAG GCAGTGGCCA CGTTCAGGGA 1680

GGGGCCGGCT GGCCTGGAGG CTCAGCCCAC CCTCCAGCTT TTCCTCAGGG TGTCCTGAGG 1740

TCCAAGATTC TGGAGCAATC TGACCCTTCT CCAAAGGCTC TGTTATCAGC TGGGCAGTGC 1800

CAGCCAATCC CTGGCCATTT GGCCCCAGGG GACGTGGGCC CTG 1843

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2257 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid (Edited Contig 253538a)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

CAGGGACCTA CCCCGCGCTA CTTCACCTGG GACGAGGTGG CCCAGCGCTC AGGGTGCGAG 60

GAGCGGTGGC TAGTGATCGA CCGTAAGGTG TACAACATCA GCGAGTTCAC CCGCCGGCAT 120 

CCAGGGGGCT CCCGGGTCAT CAGCCACTAC GCCGGGCAGG ATGCCACGGA TCCCTTTGTG 180 

GCCTTCCACA TCAACAAGGG CCTTGTGAAG AAGTATATGA ACTCTCTCCT GATTGGAGAA 240 

CTGTCTCCAG AGCAGCCCAG CTTTGAGCCC ACCAAGAATA AAGAGCTGAC AGATGAGTTC 300 

CGGGAGCTGC GGGCCACAGT GGAGCGGATG GGGCTCATGA AGGCCAACCA TGTCTTCTTC 360

CTGCTGTACC TGCTGCACAT CTTGCTGCTG GATGGTGCAG CCTGGCTCAC CCTTTGGGTC 420 

TTTGGGACGT CCTTTTTGCC CTTCCTCCTC TGTGCGGTGC TGCTCAGTGC AGTTCAGCAG 480 

GCCCAAGCTG GATGGCTGCA ACATGATTAT GGCCACCTGT CTGTCTACAG AAAACCCAAG 540 

TGGAACCACC TTGTCCACAA ATTCGTCATT GGCCACTTAA AGGGTGCCTC TGCCAACTGG 600 

TGGAATCATC GCCACTTCCA GCACCACGCC AAGCCTAACA TCTTCCACAA GGATCCCGAT 660

GTGAACATGC TGCACGTGTT TGTTCTGGGC GAATGGCAGC CCATCGAGTA CGGCAAGAAG 720 

AAGCTGAAAT ACCTGCCCTA CAATCACCAG CACGAATACT TCTTCCTGAT TGGGCCGCCG 780 

CTGCTCATCC CCATGTATTT CCAGTACCAG ATCATCATGA CCATGATCGT CCATAAGAAC 840

TGGGTGGACC TGGCCTGGGC CGTCAGCTAC TACATCCGGT TCTTCATCAC CTACATCCCT 900

TTCTACGGCA TCCTGGGAGC CCTCCTTTTC CTCAACTTCA TCAGGTTCCT GGAGAGCCAC 960

TGGTTTGTGT GGGTCACACA GATGAATCAC ATCGTCATGG AGATTGACCA GGAGGCCTAC 1020 

CGTGACTGGT TCAGTAGCCA GCTGACAGCC ACCTGCAACG TGGAGCAGTC CTTCTTCAAC 1080 

GACTGGTTCA GTGGACACCT TAACTTCCAG ATTGAGCACC ACCTCTTCCC CACCATGCCC 1140 

CGGCACAACT TACACAAGAT CGCCCCGCTG GTGAAGTCTC TATGTGCCAA GCATGGCATT 1200 

GAATACCAGG AGAAGCCGCT ACTGAGGGCC CTGCTGGACA TCATCAGGTC CCTGAAGAAG 1260

TCTGGGAAGC TGTGGCTGGA CGCCTACCTT CACAAATGAA GCCACAGCCC CCGGGACACC 1320 

GTGGGGAAGG GGTGCAGGTG GGGTGATGGC CAGAGGAATG ATGGGCTTTT GTTCTGAGGG 1380 

GTGTCCGAGA GGCTGGTGTA TGCACTGCTC ACGGACCCCA TGTTGGATCT TTCTCCCTTT 1440

CTCCTCTCCT TTTTCTCTTC ACATCTCCCC CATAGCACCC TGCCCTCATG GGACCTGCCC 1500

TCCCTCAGCC GTCAGCCATC AGCCATGGCC CTCCCAGTGC CTCCTAGCCC CTTCTTCCAA 1560

GGAGCAGAGA GGTGGCCACC GGGGGTGGCT CTGTCCTACC TCCACTCTCT GCCCCTAAAG 1620 

ATGGGAGGAG ACCAGCGGTC CATGGGTCTG GCCTGTGAGT CTCCCCTTGC AGCCTGGTCA 1680 

CTAGGCATCA CCCCCGCTTT GGTTCTTCAG ATGCTCTTGG GGTTCATAGG GGCAGGTCCT 1740

AGTCGGGCAG GGCCCCTGAC CCTCCCGGCC TGGCTTCACT CTCCCTGACG GCTGCCATTG 1800 

GTCCACCCTT TCATAGAGAG GCCTGCTTTG TTACAAAGCT CGGGTCTCCC TCCTGCAGCT 1860 

CGGTTAAGTA CCCGAGGCCT CTCTTAAGAT GTCCAGGGCC CCAGGCCCGC GGGCACAGCC 1920 

AGCCCAAACC TTGGGCCCTG GAAGAGTCCT CCACCCCATC ACTAGAGTGC TCTGACCCTG 1980 

GGCTTTCACG GGCCCCATTC CACCGCCTCC CCAACTTGAG CCTGTGACCT TGGGACCAAA 2040

GGGGGAGTCC CTCGTCTCTT GTGACTCAGC AGAGGCAGTG GCCACGTTCA GGGAGGGGCC 2100 

GGCTGGCCTG GAGGCTCAGC CCACCCTCCA GCTTTTCCTC AGGGTGTCCT GAGGTCCAAG 2160 

ATTCTGGAGC AATCTGACCC TTCTCCAAAG GCTCTGTTAT CAGCTGGGCA GTGCCAGCCA 2220 

ATCCCTGGCC ATTTGGCCCC AGGGGACGTG GGCCCTG 2257 



(2) INFORMATION FOR SEQ ID NO: 34:



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 411 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: amino acid (Translation of Contig 2692004)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:



His Ala Asp Arg Arg Arg Glu Ile Leu Ala Lys Tyr Pro Glu Ile
1               5                   10                  15

Lys Ser Leu Met Lys Pro Asp Pro Asn Leu Ile Trp Ile Ile Ile
                20                  25                  30

Met Met Val Leu Thr Gln Leu Gly Ala Phe Tyr Ile Val Lys Asp
                35                  40                  45

Leu Asp Trp Lys Trp Val Ile Phe Gly Ala Tyr Ala Phe Gly Ser
                50                  55                  60

Cys Ile Asn His Ser Met Thr Leu Ala Ile His Glu Ile Ala His
                65                  70                  75

Asn Ala Ala Phe Gly Asn Cys Lys Ala Met Trp Asn Arg Trp Phe
                80                  85                  90

Gly Met Phe Ala Asn Leu Pro Ile Gly Ile Pro Tyr Ser Ile Ser
                95                  100                 105

Phe Lys Arg Tyr His Met Asp His His Arg Tyr Leu Gly Ala Asp
                110                 115                 120

Gly Val Asp Val Asp Ile Pro Thr Asp Phe Glu Gly Trp Phe Phe
                125                 130                 135

Cys Thr Ala Phe Arg Lys Phe Ile Trp Val Ile Leu Gln Pro Leu
                140                 145                 150

Phe Tyr Ala Phe Arg Pro Leu Phe Ile Asn Pro Lys Pro Ile Thr
                155                 160                 165

Tyr Leu Glu Val Ile Asn Thr Val Ala Gln Val Thr Phe Asp Ile
                170                 175                 180

Leu Ile Tyr Tyr Phe Leu Gly Ile Lys Ser Leu Val Tyr Met Leu
                185                 190                 195

Ala Ala Ser Leu Leu Gly Leu Gly Leu His Pro Ile Ser Gly His
                200                 205                 210

Phe Ile Ala Glu His Tyr Met Phe Leu Lys Gly His Glu Thr Tyr
                215                 220                 225

Ser Tyr Tyr Gly Pro Leu Asn Leu Leu Thr Phe Asn Val Gly Tyr
                230                 235                 240

His Asn Glu His His Asp Phe Pro Asn Ile Pro Gly Lys Ser Leu
                245                 250                 255

Pro Leu Val Arg Lys Ile Ala Ala Glu Tyr Tyr Asp Asn Leu Pro
                260                 265                 270

His Tyr Asn Ser Trp Ile Lys Val Leu Tyr Asp Phe Val Met Asp
                275                 280                 285

Asp Thr Ile Ser Pro Tyr Ser Arg Met Lys Arg His Gln Lys Gly
                290                 295                 300

Glu Met Val Leu Glu *** Ile Ser Leu Val Pro Lys Gly Phe Phe
                305                 310                 315

Ser Lys Thr Leu Asp Asp Lys Met Glu Phe Leu His Tyr *** Thr
                320                 325                 330

*** Asp Gln *** Cys Ser Glu Ala Pro Leu Ala Gln Phe Gln Ser
                335                 340                 345

Lys Ser Ser Val Ile Pro Arg Ser Glu Ser Gly Phe *** Thr Val
                350                 355                 360

Ser Leu Thr Leu Tyr Cys Ser Val Ser Leu Thr Gly Asn Leu ***
                365                 370                 375

Leu Val Tyr Tyr Arg His *** Gly Cys Phe Thr His Val Cys His
                380                 385                 390

Phe Ile Ser Ile Ser Phe Lys Lys Leu Leu Lys Ser Tyr Phe Ala
                400                 405                 410

Arg

(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 218 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: amino acid (Translation of Contig 2153526)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

Tyr Leu Leu Arg Pro Leu Leu Pro His Leu Cys Ala Thr Ile Gly
1               5                   10                  15

Ala Glu Ser Phe Leu Gly Leu Phe Phe Ile Val Arg Phe Leu Glu
                20                  25                  30

Ser Asn Trp Phe Val Trp Val Thr Gln Met Asn His Ile Pro Met
                35                  40                  45

His Ile Asp His Asp Arg Asn Met Asp Trp Val Ser Thr Gln Leu
                50                  55                  60

Gln Ala Thr Cys Asn Val His Lys Ser Ala Phe Asn Asp Trp Phe
                65                  70                  75

Ser Gly His Leu Asn Phe Gln Ile Glu His His Leu Phe Pro Thr
                80                  85                  90

Met Pro Arg His Asn Tyr His Lys Val Ala Pro Leu Val Gln Ser
                95                  100                 105

Leu Cys Ala Lys His Gly Ile Glu Tyr Gln Ser Lys Pro Leu Leu
                110                 115                 120

Ser Ala Phe Ala Asp Ile Ile His Ser Leu Lys Glu Ser Gly Gln
                125                 130                 135

Leu Trp Leu Asp Ala Tyr Leu His Gln *** Gln Gln Pro Pro Cys
                140                 145                 150

Pro Val Trp Lys Lys Arg Arg Lys Thr Leu Glu Pro Arg Gln Arg
                155                 160                 165

Gly Ala *** Gly Thr Met Pro Leu *** Phe Asn Thr Gln Arg Gly
                170                 175                 180

Leu Gly Leu Gly Thr *** Ser Leu *** Leu Lys Leu Leu Pro Phe
                185                 190                 195

Ile Phe *** Pro Gln Phe *** Asp Pro Lys Trp Gly Val Asp Thr
                200                 205                 210

Glu Val Pro Arg Arg Glu Gly Ala
                215
















(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 86 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: amino acid (Translation of Contig 3506132)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:



Val Phe Tyr Phe Gly Asn Gly Trp Ile Pro Thr Leu Ile Thr Ala
1               5                   10                  15

Phe Val Leu Ala Thr Ser Gln Ala Gln Ala Gly Trp Leu Gln His
                20                  25                  30

Asp Tyr Gly His Leu Ser Val Tyr Arg Lys Pro Lys Trp Asn His
                35                  40                  45

Leu Val His Lys Phe Val Ile Gly His Leu Lys Gly Ala Ser Ala
                50                  55                  60

Asn Trp Trp Asn His Arg His Phe Gln His His Ala Lys Pro Asn
                65                  70                  75

Leu Gly Glu Trp Gln Pro Ile Glu Tyr Gly Lys Xxx
                80                  85













(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 306 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: amino acid (Translation of Contig 3854933)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:



Gln Gly Pro Thr Pro Arg Tyr Phe Thr Trp Asp Glu Val Ala Gln
1               5                   10                  15

Arg Ser Gly Cys Glu Glu Arg Trp Leu Val Ile Asp Arg Lys Val
                20                  25                  30

Tyr Asn Ile Ser Glu Phe Thr Arg Arg His Pro Gly Gly Ser Arg
                35                  40                  45

Val Ile Ser His Tyr Ala Gly Gln Asp Ala Thr Asp Pro Phe Val
                50                  55                  60

Ala Phe His Ile Asn Lys Gly Leu Val Lys Lys Tyr Met Asn Ser
                65                  70                  75

Leu Leu Ile Gly Glu Leu Ser Pro Glu Gln Pro Ser Phe Glu Pro
                80                  85                  90

Thr Lys Asn Lys Glu Leu Thr Asp Glu Phe Arg Glu Leu Arg Ala
                95                  100                 105

Thr Val Glu Arg Met Gly Leu Met Lys Ala Asn His Val Phe Phe
                110                 115                 120

Leu Leu Tyr Leu Leu His Ile Leu Leu Leu Asp Gly Ala Ala Trp
                125                 130                 135

Leu Thr Leu Trp Val Phe Gly Thr Ser Phe Leu Pro Phe Leu Leu
                140                 145                 150

Cys Ala Val Leu Leu Ser Ala Val Gln Ala Gln Ala Gly Trp Leu
                155                 160                 165

Gln His Asp Phe Gly His Leu Ser Val Phe Ser Thr Ser Lys Trp
                170                 175                 180

Asn His Leu Leu His His Phe Val Ile Gly His Leu Lys Gly Ala
                185                 190                 195

Pro Ala Ser Trp Trp Asn His Met His Phe Gln His His Ala Lys
                200                 205                 210

Pro Asn Cys Phe Arg Lys Asp Pro Asp Ile Asn Met His Pro Phe
                215                 220                 225

Phe Phe Ala Leu Gly Lys Ile Leu Ser Val Glu Leu Gly Lys Gln
                230                 235                 240

Lys Lys Lys Tyr Met Pro Tyr Asn His Gln His Xxx Tyr Phe Phe
                245                 250                 255

Leu Ile Gly Pro Pro Ala Leu Leu Pro Leu Tyr Phe Gln Trp Tyr
                260                 265                 270

Ile Phe Tyr Phe Val Ile Gln Arg Lys Lys Trp Val Asp Leu Ala
                275                 280                 285

Trp Ile Ser Lys Gln Glu Tyr Asp Glu Ala Gly Leu Pro Leu Ser
                290                 295                 300

Thr Ala Asn Ala Ser Lys
                305



















(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 566 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: amino acid (Translation of Contig 2511785)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:





His 


Leu 


Lys 


Gly 


Ala 


Ser 


Ala 


Asn 


Trp Trp 


Asn His 


Arg 


His 


Phe 




1 








5 








10 








15 




Gin 


His 


His 


Ala 


Lys 


Pro 


Asn 


He 


Phe His 


Lys Asp 


Pro 


Asp 


Val 


50 










20 








25 








30 




Asn 


Met 


Leu 


His 


Val 
35 


Phe 


Val 


Leu 


Gly Glu 
40 


Trp Gin 


Pro 


He 


Glu 
45 




Tyr Gly 


Lys 


Lys 


Lys 


Leu 


Lys 


Tyr 


Leu Pro 


Tyr Asn His 


Gin 


His 


55 










50 








55 








60 


Glu 


Tyr 


Phe 


Phe 


Leu 
65 


He 


Gly 


Pro 


Pro Leu 
70 


Leu He 


Pro 


Met 


Tyr 
75 




Phe 


Gin 


Tyr 


Gin 


He 
80 


He 


Met 


Thr 


Met He 
85 


Val His 


Lys 


Asn 


Trp 
90 


60 


Val 


Asp 


Leu 


Ala 


Trp 


Ala 


Val 


Ser 


Tyr Tyr 


He Arg 


Phe 


Phe 


He 










95 








100 








105 




Thr 


Tyr 


He 


Pro 


Phe 
110 


Tyr 


Gly 


He 


Leu Gly 
115 


Ala Leu 


Leu 


Phe 


Leu 
120 




Asn 


Phe 


He 


Arg 


Phe 


Leu 


Glu 


Ser 


His Trp 


Phe Val 


Trp 


Val 


Thr 


65 










125 








130 








135 


Gln Met Asn His Ile Val Met Glu Ile Asp Gln Glu Ala Tyr Arg
140 145 150

Asp Trp Phe Ser Ser Gln Leu Thr Ala Thr Cys Asn Val Glu Gln
155 160 165

Ser Phe Phe Asn Asp Trp Phe Ser Gly His Leu Asn Phe Gln Ile
170 175 180

Glu His His Leu Phe Pro Thr Met Pro Arg His Asn Leu His Lys
185 190 195

Ile Ala Pro Leu Val Lys Ser Leu Cys Ala Lys His Gly Ile Glu
200 205 210

Tyr Gln Glu Lys Pro Leu Leu Arg Ala Leu Leu Asp Ile Ile Arg
215 220 225

Ser Leu Lys Lys Ser Gly Lys Leu Trp Leu Asp Ala Tyr Leu His
230 235 240

Lys *** Ser His Ser Pro Arg Asp Thr Val Gly Lys Gly Cys Arg
245 250 255

Trp Gly Asp Gly Gln Arg Asn Asp Gly Leu Leu Phe *** Gly Val
260 265 270

Ser Glu Arg Leu Val Tyr Ala Leu Leu Thr Asp Pro Met Leu Asp
275 280 285

Leu Ser Pro Phe Leu Leu Ser Phe Phe Ser Ser His Leu Pro His
290 295 300

Ser Thr Leu Pro Ser Trp Asp Leu Pro Ser Leu Ser Arg Gln Pro
305 310 315

Ser Ala Met Ala Leu Pro Val Pro Pro Ser Pro Phe Phe Gln Gly
320 325 330

Ala Glu Arg Trp Pro Pro Gly Val Ala Leu Ser Tyr Leu His Ser
335 340 345

Leu Pro Leu Lys Met Gly Gly Asp Gln Arg Ser Met Gly Leu Ala
350 355 360

Cys Glu Ser Pro Leu Ala Ala Trp Ser Leu Gly Ile Thr Pro Ala
365 370 375

Leu Val Leu Gln Met Leu Leu Gly Phe Ile Gly Ala Gly Pro Ser
380 385 390

Arg Ala Gly Pro Leu Thr Leu Pro Ala Trp Leu His Ser Pro ***
400 405 410

Arg Leu Pro Leu Val His Pro Phe Ile Glu Arg Pro Ala Leu Leu
415 420 425

Gln Ser Ser Gly Leu Pro Pro Ala Ala Arg Leu Ser Thr Arg Gly
430 435 440

Leu Ser *** Asp Val Gln Gly Pro Arg Pro Ala Gly Thr Ala Ser
445 450 455

Pro Asn Leu Gly Pro Trp Lys Ser Pro Pro Pro His His *** Ser
460 465 470

Ala Leu Thr Leu Gly Phe His Gly Pro His Ser Thr Ala Ser Pro
475 480 485

Thr *** Ala Cys Asp Leu Gly Thr Lys Gly Gly Val Pro Arg Leu
490 495 500

Leu *** Leu Ser Arg Gly Ser Gly His Val Gln Gly Gly Ala Gly
505 510 515

Trp Pro Gly Gly Ser Ala His Pro Pro Ala Phe Pro Gln Gly Val
520 525 530

Leu Arg Ser Lys Ile Leu Glu Gln Ser Asp Pro Ser Pro Lys Ala
535 540 545

Leu Leu Ser Ala Gly Gln Cys Gln Pro Ile Pro Gly His Leu Ala
550 555 560

Pro Gly Asp Val Gly Pro Xxx
565



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 619 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: amino acid (Translation of Contig 2535)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 



Val Phe Tyr Phe Gly Asn Gly Trp Ile Pro Thr Leu Ile Thr Ala
1 5 10 15

Phe Val Leu Ala Thr Ser Gln Ala Gln Ala Gly Trp Leu Gln His
20 25 30

Asp Tyr Gly His Leu Ser Val Tyr Arg Lys Pro Lys Trp Asn His
35 40 45

Leu Val His Lys Phe Val Ile Gly His Leu Lys Gly Ala Ser Ala
50 55 60

Asn Trp Trp Asn His Arg His Phe Gln His His Ala Lys Pro Asn
65 70 75

Ile Phe His Lys Asp Pro Asp Val Asn Met Leu His Val Phe Val
80 85 90

Leu Gly Glu Trp Gln Pro Ile Glu Tyr Gly Lys Lys Lys Leu Lys
95 100 105

Tyr Leu Pro Tyr Asn His Gln His Glu Tyr Phe Phe Leu Ile Gly
110 115 120

Pro Pro Leu Leu Ile Pro Met Tyr Phe Gln Tyr Gln Ile Ile Met
125 130 135

Thr Met Ile Val His Lys Asn Trp Val Asp Leu Ala Trp Ala Val
140 145 150

Ser Tyr Tyr Ile Arg Phe Phe Ile Thr Tyr Ile Pro Phe Tyr Gly
155 160 165

Ile Leu Gly Ala Leu Leu Phe Leu Asn Phe Ile Arg Phe Leu Glu
170 175 180

Ser His Trp Phe Val Trp Val Thr Gln Met Asn His Ile Val Met
185 190 195

Glu Ile Asp Gln Glu Ala Tyr Arg Asp Trp Phe Ser Ser Gln Leu
200 205 210

Thr Ala Thr Cys Asn Val Glu Gln Ser Phe Phe Asn Asp Trp Phe
215 220 225

Ser Gly His Leu Asn Phe Gln Ile Glu His His Leu Phe Pro Thr
230 235 240

Met Pro Arg His Asn Leu His Lys Ile Ala Pro Leu Val Lys Ser
245 250 255

Leu Cys Ala Lys His Gly Ile Glu Tyr Gln Glu Lys Pro Leu Leu
260 265 270

Arg Ala Leu Leu Asp Ile Ile Arg Ser Leu Lys Lys Ser Gly Lys
275 280 285

Leu Trp Leu Asp Ala Tyr Leu His Lys *** Ser His Ser Pro Arg
290 295 300


Asp Thr Val Gly Lys Gly Cys Arg Trp Gly Asp Gly Gln Arg Asn
305 310 315

Asp Gly Leu Leu Phe *** Gly Val Ser Glu Arg Leu Val Tyr Ala
320 325 330

Leu Leu Thr Asp Pro Met Leu Asp Leu Ser Pro Phe Leu Leu Ser
335 340 345

Phe Phe Ser Ser His Leu Pro His Ser Thr Leu Pro Ser Trp Asp
350 355 360

Leu Pro Ser Leu Ser Arg Gln Pro Ser Ala Met Ala Leu Pro Val
365 370 375

Pro Pro Ser Pro Phe Phe Gln Gly Ala Glu Arg Trp Pro Pro Gly
380 385 390

Val Ala Leu Ser Tyr Leu His Ser Leu Pro Leu Lys Met Gly Gly
400 405 410

Asp Gln Arg Ser Met Gly Leu Ala Cys Glu Ser Pro Leu Ala Ala
415 420 425

Trp Ser Leu Gly Ile Thr Pro Ala Leu Val Leu Gln Met Leu Leu
430 435 440

Gly Phe Ile Gly Ala Gly Pro Ser Arg Ala Gly Pro Leu Thr Leu
445 450 455

Pro Ala Trp Leu His Ser Pro *** Arg Leu Pro Leu Val His Pro
460 465 470

Phe Ile Glu Arg Pro Ala Leu Leu Gln Ser Ser Gly Leu Pro Pro
475 480 485

Ala Ala Arg Leu Ser Thr Arg Gly Leu Ser *** Asp Val Gln Gly
490 495 500

Pro Arg Pro Ala Gly Thr Ala Ser Pro Asn Leu Gly Pro Trp Lys
505 510 515

Ser Pro Pro Pro His His *** Ser Ala Leu Thr Leu Gly Phe His
520 525 530

Gly Pro His Ser Thr Ala Ser Pro Thr *** Ala Cys Asp Leu Gly
535 540 545

Thr Lys Gly Gly Val Pro Arg Leu Leu *** Leu Ser Arg Gly Ser
550 555 560

Gly His Val Gln Gly Gly Ala Gly Trp Pro Gly Gly Ser Ala His
565 570 575

Pro Pro Ala Phe Pro Gln Gly Val Leu Arg Ser Lys Ile Leu Glu
580 585 590

Gln Ser Asp Pro Ser Pro Lys Ala Leu Leu Ser Ala Gly Gln Cys
595 600 605

Gln Pro Ile Pro Gly His Leu Ala Pro Gly Asp Val Gly Pro Xxx
610 615 620



(2) INFORMATION FOR SEQ ID NO: 40: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 757 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: amino acid (Translation of Contig 253538a)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:





Gln Gly Pro Thr Pro Arg Tyr Phe Thr Trp Asp Glu Val Ala Gln
1 5 10 15

Arg Ser Gly Cys Glu Glu Arg Trp Leu Val Ile Asp Arg Lys Val
20 25 30

Tyr Asn Ile Ser Glu Phe Thr Arg Arg His Pro Gly Gly Ser Arg
35 40 45

Val Ile Ser His Tyr Ala Gly Gln Asp Ala Thr Asp Pro Phe Val
50 55 60

Ala Phe His Ile Asn Lys Gly Leu Val Lys Lys Tyr Met Asn Ser
65 70 75

Leu Leu Ile Gly Glu Leu Ser Pro Glu Gln Pro Ser Phe Glu Pro
80 85 90

Thr Lys Asn Lys Glu Leu Thr Asp Glu Phe Arg Glu Leu Arg Ala
95 100 105

Thr Val Glu Arg Met Gly Leu Met Lys Ala Asn His Val Phe Phe
110 115 120

Leu Leu Tyr Leu Leu His Ile Leu Leu Leu Asp Gly Ala Ala Trp
125 130 135

Leu Thr Leu Trp Val Phe Gly Thr Ser Phe Leu Pro Phe Leu Leu
140 145 150

Cys Ala Val Leu Leu Ser Ala Val Gln Gln Ala Gln Ala Gly Trp
155 160 165

Leu Gln His Asp Tyr Gly His Leu Ser Val Tyr Arg Lys Pro Lys
170 175 180

Trp Asn His Leu Val His Lys Phe Val Ile Gly His Leu Lys Gly
185 190 195


Ala Ser Ala Asn Trp Trp Asn His Arg His Phe Gln His His Ala
200 205 210

Lys Pro Asn Ile Phe His Lys Asp Pro Asp Val Asn Met Leu His
215 220 225

Val Phe Val Leu Gly Glu Trp Gln Pro Ile Glu Tyr Gly Lys Lys
230 235 240

Lys Leu Lys Tyr Leu Pro Tyr Asn His Gln His Glu Tyr Phe Phe
245 250 255

Leu Ile Gly Pro Pro Leu Leu Ile Pro Met Tyr Phe Gln Tyr Gln
260 265 270

Ile Ile Met Thr Met Ile Val His Lys Asn Trp Val Asp Leu Ala
275 280 285

Trp Ala Val Ser Tyr Tyr Ile Arg Phe Phe Ile Thr Tyr Ile Pro
290 295 300

Phe Tyr Gly Ile Leu Gly Ala Leu Leu Phe Leu Asn Phe Ile Arg
305 310 315

Phe Leu Glu Ser His Trp Phe Val Trp Val Thr Gln Met Asn His
320 325 330

Ile Val Met Glu Ile Asp Gln Glu Ala Tyr Arg Asp Trp Phe Ser
335 340 345

Ser Gln Leu Thr Ala Thr Cys Asn Val Glu Gln Ser Phe Phe Asn
350 355 360

Asp Trp Phe Ser Gly His Leu Asn Phe Gln Ile Glu His His Leu
365 370 375

Phe Pro Thr Met Pro Arg His Asn Leu His Lys Ile Ala Pro Leu
380 385 390

Val Lys Ser Leu Cys Ala Lys His Gly Ile Glu Tyr Gln Glu Lys
400 405 410

Pro Leu Leu Arg Ala Leu Leu Asp Ile Ile Arg Ser Leu Lys Lys
415 420 425

Ser Gly Lys Leu Trp Leu Asp Ala Tyr Leu His Lys *** Ser His
430 435 440

Ser Pro Arg Asp Thr Val Gly Lys Gly Cys Arg Trp Gly Asp Gly
445 450 455

Gln Arg Asn Asp Gly Leu Leu Phe *** Gly Val Ser Glu Arg Leu
460 465 470

Val Tyr Ala Leu Leu Thr Asp Pro Met Leu Asp Leu Ser Pro Phe
475 480 485

Leu Leu Ser Phe Phe Ser Ser His Leu Pro His Ser Thr Leu Pro
490 495 500

Ser Trp Asp Leu Pro Ser Leu Ser Arg Gln Pro Ser Ala Met Ala
505 510 515

Leu Pro Val Pro Pro Ser Pro Phe Phe Gln Gly Ala Glu Arg Trp
520 525 530

Pro Pro Gly Val Ala Leu Ser Tyr Leu His Ser Leu Pro Leu Lys
535 540 545

Met Gly Gly Asp Gln Arg Ser Met Gly Leu Ala Cys Glu Ser Pro
550 555 560

Leu Ala Ala Trp Ser Leu Gly Ile Thr Pro Ala Leu Val Leu Gln
565 570 575

Met Leu Leu Gly Phe Ile Gly Ala Gly Pro Ser Arg Ala Gly Pro
580 585 590

Leu Thr Leu Pro Ala Trp Leu His Ser Pro *** Arg Leu Pro Leu
595 600 605

Val His Pro Phe Ile Glu Arg Pro Ala Leu Leu Gln Ser Ser Gly
610 615 620

Leu Pro Pro Ala Ala Arg Leu Ser Thr Arg Gly Leu Ser *** Asp
625 630 635

Val Gln Gly Pro Arg Pro Ala Gly Thr Ala Ser Pro Asn Leu Gly
640 645 650

Pro Trp Lys Ser Pro Pro Pro His His *** Ser Ala Leu Thr Leu
655 660 665

Gly Phe His Gly Pro His Ser Thr Ala Ser Pro Thr *** Ala Cys
670 675 680

Asp Leu Gly Thr Lys Gly Gly Val Pro Arg Leu Leu *** Leu Ser
685 690 695

Arg Gly Ser Gly His Val Gln Gly Gly Ala Gly Trp Pro Gly Gly
700 705 710

Ser Ala His Pro Pro Ala Phe Pro Gln Gly Val Leu Arg Ser Lys
715 720 725

Ile Leu Glu Gln Ser Asp Pro Ser Pro Lys Ala Leu Leu Ser Ala
730 735 740

Gly Gln Cys Gln Pro Ile Pro Gly His Leu Ala Pro Gly Asp Val
745 750 755

Gly Pro Xxx

What is claimed is:

1. An isolated nucleic acid comprising:

a nucleotide sequence depicted in SEQ ID NO: 1 or SEQ ID NO: 3.

2. A polypeptide encoded by a nucleotide sequence according to claim 1. 

3. A purified or isolated polypeptide comprising an amino acid sequence 
depicted in SEQ ID NO: 2 or SEQ ID NO: 4. 

4. An isolated nucleic acid encoding a polypeptide having an amino acid
sequence depicted in SEQ ID NO: 2 or SEQ ID NO: 4.

5. An isolated nucleic acid comprising a nucleotide sequence which encodes a
polypeptide which desaturates a fatty acid molecule at carbon 6 or 12 from the
carboxyl end of said polypeptide, wherein said nucleotide sequence has an average
A/T content of less than about 60%.

6. The isolated nucleic acid according to Claim 5, wherein said nucleic acid is
derived from a fungus.

7. The isolated nucleic acid according to Claim 6, wherein said fungus is of the 
genus Mortierella. 

8. The isolated nucleic acid according to Claim 7, wherein said fungus is of the
species Mortierella alpina.

9. An isolated nucleic acid, wherein the nucleotide sequence of said nucleic
acid is depicted in SEQ ID NO: 1 or SEQ ID NO: 3.

10. An isolated or purified polypeptide which desaturates a fatty acid molecule at
carbon 6 or 12 from the carboxyl end of said polypeptide, wherein said polypeptide
is a eukaryotic polypeptide or is derived from a eukaryotic polypeptide.

11. The isolated or purified eukaryotic polypeptide according to Claim 10,
wherein said eukaryotic polypeptide is derived from a fungus.

12. A nucleic acid comprising:

a fungal nucleotide sequence which is substantially identical to a sequence of at
least 50 nucleotides in SEQ ID NO: 1 or SEQ ID NO: 3 or is complementary to a
sequence of at least 50 nucleotides in SEQ ID NO: 1 or SEQ ID NO: 3.

13. An isolated nucleic acid having a nucleotide sequence with at least about 
50% homology to SEQ ID NO: 1 or SEQ ID NO: 3. 

14. An isolated nucleic acid having a nucleotide sequence with at least about
50% homology to sequence encoding an amino acid sequence depicted in SEQ ID
NO: 2 or SEQ ID NO: 4.

15. The nucleic acid of claim 14, wherein said amino acid sequence depicted in
SEQ ID NO: 2 is selected from the group consisting of amino acid residues 50-53,
39-43, 172-176, 204-213, and 390-402.

16. A nucleic acid construct comprising:

a nucleotide sequence depicted in a SEQ ID NO: 1 or SEQ ID NO: 3 linked
to a heterologous nucleic acid.

17. A nucleic acid construct comprising:

a nucleotide sequence depicted in a SEQ ID NO: 1 or SEQ ID NO: 3
operably associated with an expression control sequence functional in a microbial
cell.

18. The nucleic acid construct according to Claim 17, wherein said microbial
cell is a yeast cell.

19. The nucleic acid construct according to Claim 17, wherein said nucleotide
sequence is derived from a fungus.

20. The nucleic acid construct according to Claim 19, wherein said fungus is of
the genus Mortierella.

21. The nucleic acid construct according to Claim 20, wherein said fungus is of
the species Mortierella alpina.

22. A nucleic acid construct comprising:

a fungal nucleotide sequence which encodes a polypeptide comprising an
amino acid sequence which corresponds to or is complementary to an amino acid
sequence depicted in SEQ ID NO: 2 or SEQ ID NO: 4, wherein said nucleic acid is
operably associated with an expression control sequence functional in a microbial
cell, wherein said nucleotide sequence encodes a functionally active polypeptide
which desaturates a fatty acid molecule at carbon 6 or 12 from the carboxyl end of a
fatty acid molecule.

WO 98/46763

PCT/US98/07126



23. A nucleic acid construct comprising:

a nucleotide sequence having an A/T content of less than about 60% which
encodes a functionally active Δ6-desaturase having an amino acid sequence which
corresponds to or is complementary to all of or a portion of an amino acid sequence
depicted in a SEQ ID NO: 2, wherein said nucleotide sequence is operably
associated with a transcription control sequence functional in a yeast cell.

24. A nucleic acid construct comprising:

a fungal nucleotide sequence which encodes a functionally active Δ12-desaturase
having an amino acid sequence which corresponds to or is
complementary to all of or a portion of an amino acid sequence depicted in a SEQ
ID NO: 4, wherein said nucleotide sequence is operably associated with a
transcription control sequence functional in a yeast cell.

25. A recombinant yeast cell comprising: 

a nucleic acid construct according to Claim 23 or Claim 24. 

26. The recombinant yeast cell according to Claim 25, wherein said yeast cell is
a Saccharomyces cell.

27. A recombinant yeast cell comprising:

at least one copy of a vector comprising a fungal nucleotide sequence which
encodes a polypeptide which converts 18:2 fatty acids to 18:3 fatty acids or 18:3
fatty acids to 18:4 fatty acids, wherein said yeast cell or an ancestor of said yeast cell
was transformed with said vector to produce said recombinant yeast cell, and
wherein said nucleotide sequence is operably associated with an expression control
sequence functional in said recombinant yeast cell.

28. The recombinant yeast cell according to claim 27, wherein said
fungal nucleotide sequence is a Mortierella nucleotide sequence.

29. The recombinant yeast cell according to Claim 28, wherein said
recombinant yeast cell is a Saccharomyces cell.

30. The microbial cell according to Claim 27, wherein said expression 
control sequence is provided in said expression vector. 

31. A method for production of GLA in a yeast culture, said method
comprising:

growing a yeast culture having a plurality of recombinant yeast cells,
wherein said yeast cells or an ancestor of said yeast cells were transformed with a
vector comprising fungal DNA encoding a polypeptide which converts LA to GLA,
wherein said DNA is operably associated with an expression control sequence
functional in said yeast cells, under conditions whereby said DNA is expressed,
whereby GLA is produced from LA in said yeast culture.

32. The method according to Claim 31, wherein said fungal DNA is
Mortierella DNA and said polypeptide is a Δ6 desaturase.

33. The method according to Claim 32, wherein Mortierella is of the
species Mortierella alpina.

34. The method according to Claim 31, wherein said LA is exogenously
supplied.






35. The method according to Claim 31, wherein said conditions are 
inducible. 

36. A method for production of stearidonic acid in a yeast culture, said
method comprising:

growing a yeast culture having a plurality of recombinant yeast cells,
wherein said yeast cells or an ancestor of said yeast cells were transformed with a
vector comprising fungal DNA encoding a polypeptide which converts α-linolenic
acid to stearidonic acid, wherein said DNA is operably associated with an expression
control sequence functional in said yeast cells, under conditions whereby said DNA
is expressed, whereby stearidonic acid is produced from α-linolenic acid in said
yeast culture.

37. The method according to Claim 36, wherein said fungal DNA is
Mortierella DNA and said polypeptide is a Δ6 desaturase.

38. The method according to Claim 37, wherein Mortierella is of the
species Mortierella alpina.

39. The method according to Claim 36, wherein said α-linolenic acid is
exogenously supplied.

40. The method according to Claim 36, wherein said conditions are
inducible.

41. A method for production of linoleic acid in a yeast culture, said
method comprising:

growing a yeast culture having a plurality of recombinant yeast cells,
wherein said yeast cells or an ancestor of said yeast cells were transformed with a
vector comprising fungal DNA encoding a polypeptide which converts oleic acid to
linoleic acid, wherein said DNA is operably associated with an expression control
sequence functional in said yeast cells, under conditions whereby said DNA is
expressed, whereby linoleic acid is produced from oleic acid in said yeast culture.

42. The method according to Claim 41, wherein said fungal DNA is
Mortierella DNA and said polypeptide is a Δ12 desaturase.

43. The method according to Claim 42, wherein Mortierella is of the 
species Mortierella alpina. 

44. The method according to Claim 41, wherein said conditions are
inducible.

45. An isolated or purified polypeptide which desaturates a fatty acid
molecule at carbon 12 from the carboxyl end of said polypeptide, wherein said
polypeptide is a fungal polypeptide or is derived from a fungal polypeptide.

46. The isolated or purified polypeptide according to Claim 46, wherein
said polypeptide is a Mortierella alpina Δ12 desaturase.

47. An isolated or purified polypeptide which desaturates a fatty acid
molecule at carbon 6 from the carboxyl end of said polypeptide, wherein said
polypeptide is a fungal polypeptide or is derived from a fungal polypeptide.

48. The isolated or purified polypeptide according to Claim 48, wherein
said polypeptide is a Δ6 desaturase.

49. An isolated nucleic acid encoding a polypeptide according to Claim
47 or Claim 49.



50. The nucleic acid construct according to Claim 23, wherein said
portion of an amino acid sequence depicted in SEQ ID NO: 2 comprises amino
acids 1 through 457.

51. A host cell comprising:

a nucleic acid construct according to any one of Claims 22 to 24.

52. A host cell comprising:

a vector which includes a nucleic acid which encodes a fatty acid desaturase
derived from Mortierella alpina, wherein said desaturase has an amino acid
sequence represented by SEQ ID NO:2, and wherein said nucleotide sequence is
operably linked to a promoter.

53. The host cell according to Claim 52, wherein said host cell is a
eukaryotic cell.

54. The host cell according to Claim 53, wherein said eukaryotic cell is
selected from the group consisting of a mammalian cell, a plant cell, an insect cell, a
fungal cell, an avian cell and an algal cell.



55. The host cell according to Claim 54, wherein said host cell is a fungal
cell.

56. The host cell of Claim 21, wherein said promoter is exogenously
supplied to said host cell.

57. A method for production of stearidonic acid in a eukaryotic cell
culture, said method comprising:

growing a eukaryotic cell culture having a plurality of recombinant
eukaryotic cells, wherein said recombinant eukaryotic cells or ancestors of said
recombinant eukaryotic cells were transformed with a vector comprising fungal
DNA encoding a polypeptide which converts α-linolenic acid to stearidonic acid,
wherein said DNA is operably associated with an expression control sequence
functional in said recombinant eukaryotic cells, under conditions whereby said DNA
is expressed, whereby stearidonic acid is produced from α-linolenic acid in said
eukaryotic cell culture.

58. A method for production of linoleic acid in a eukaryotic cell
culture, said method comprising:

growing a eukaryotic cell culture having a plurality of recombinant
eukaryotic cells, wherein said recombinant eukaryotic cells or ancestors of said
recombinant eukaryotic cells were transformed with a vector comprising fungal
DNA encoding a polypeptide which converts oleic acid to linoleic acid, wherein said
DNA is operably associated with an expression control sequence functional in said
recombinant eukaryotic cells, under conditions whereby said DNA is expressed,
whereby linoleic acid is produced from oleic acid in said eukaryotic cell culture.

59. The method according to Claim 57 or Claim 58, wherein said
eukaryotic cells are selected from the group consisting of mammalian cells, plant
cells, insect cells, fungal cells, avian cells and algal cells.

60. The method according to Claim 59, wherein said fungal cells are yeast
cells of the genus Saccharomyces.

61. A recombinant yeast cell comprising:

(1) at least one nucleic acid construct according to Claim 23 or 24; or

(2) at least one nucleic acid construct according to Claim 23 and at
least one nucleic acid construct according to Claim 24.

62. A recombinant yeast cell comprising:

at least one nucleic acid construct comprising a nucleotide sequence which
encodes a functionally active Δ6 desaturase having an amino acid sequence which
corresponds to or is complementary to all or a portion of an amino acid sequence
depicted in SEQ ID NO: 2, and at least one nucleic acid construct comprising a
nucleotide sequence which encodes a functionally active Δ12 desaturase having an
amino acid sequence which corresponds to or is complementary to all or a portion of
an amino acid sequence depicted in SEQ ID NO: 4, wherein said nucleic acid
constructs are operably associated with transcription control sequences functional in
a yeast cell.

63. A method of making GLA, said method comprising:

growing a recombinant yeast cell according to Claim 62 under conditions
whereby said nucleotide sequences are expressed, whereby GLA is produced in said
yeast cell.

64. A method of making GLA, said method comprising:

growing a recombinant yeast cell according to Claim 61 under conditions
whereby the nucleotide sequences in said nucleic acid constructs are expressed,
whereby GLA is produced in said yeast cell.

65. A method for obtaining altered long chain polyunsaturated fatty acid
biosynthesis comprising the steps of:

growing a plant having cells which contain one or more transgenes, derived
from a fungus or algae, which encodes a transgene expression product which
desaturates a fatty acid molecule at a carbon selected from the group consisting of
carbon 6 and carbon 12 from the carboxyl end of said fatty acid molecule, wherein
said one or more transgenes is operably associated with an expression control
sequence, under conditions whereby said one or more transgenes is expressed,
whereby long chain polyunsaturated fatty acid biosynthesis in said cells is altered.

66. The method according to claim 65, wherein said long chain
polyunsaturated fatty acid is selected from the group consisting of 18:1ω9, LA,
GLA, SDA and ALA.

67. A microbial oil or fraction thereof produced according to the method
of claim 65.

68. A method of treating or preventing malnutrition comprising
administering said microbial oil of claim 67 to a patient in need of said treatment or
prevention in an amount sufficient to effect said treatment or prevention.

69. A pharmaceutical composition comprising said microbial oil or
fraction of claim 67 and a pharmaceutically acceptable carrier.

70. The pharmaceutical composition of claim 69, wherein said
pharmaceutical composition is in the form of a solid or a liquid.

71. The pharmaceutical composition of claim 70, wherein said
pharmaceutical composition is in a capsule or tablet form.

72. The pharmaceutical composition of claim 69 further comprising at
least one nutrient selected from the group consisting of a vitamin, a mineral, a
carbohydrate, a sugar, an amino acid, a free fatty acid, a phospholipid, an
antioxidant, and a phenolic compound.

73. A nutritional formula comprising said microbial oil or fraction 
thereof of claim 67. 

74. The nutritional formula of claim 73, wherein said nutritional formula
is selected from the group consisting of an infant formula, a dietary supplement, and
a dietary substitute.

75. The nutritional formula of claim 74, wherein said infant formula,
dietary supplement or dietary supplement is in the form of a liquid or a solid.

76. An infant formula comprising said microbial oil or fraction thereof of
claim 67.

77. The infant formula of claim 76 further comprising at least one
macronutrient selected from the group consisting of coconut oil, soy oil, canola oil,
mono- and diglycerides, glucose, edible lactose, electrodialysed whey,
electrodialysed skim milk, milk whey, soy protein, and other protein hydrolysates.

78. The infant formula of claim 77 further comprising at least one
vitamin selected from the group consisting of Vitamins A, C, D, E, and B complex;
and at least one mineral selected from the group consisting of calcium, magnesium,
zinc, manganese, sodium, potassium, phosphorus, copper, chloride, iodine,
selenium, and iron.

79. A dietary supplement comprising said microbial oil or fraction
thereof of claim 67.

80. The dietary supplement of claim 79 further comprising at least one
macronutrient selected from the group consisting of coconut oil, soy oil, canola oil,
mono- and diglycerides, glucose, edible lactose, electrodialysed whey,
electrodialysed skim milk, milk whey, soy protein, and other protein hydrolysates.

8 1 . The dietary supplement of claim 80 farther comprising at least one 
vitamin selected from the group consisting of Vitamins A, C, D, E, and B complex; 
and at least one mineral selected from the group consisting of calcium, magnesium, 
zinc, manganese, sodiirai, potassium, phosphorus, copper, chloride, iodine, 
selenium, and iron. 

82. The dietary supplement of claim 79 or claim 8 1 , wherein said dietary 
supplement is administered to a human or an animal. 

83. A dietary substitute comprising said microbial oil or fraction thereof 
of claim 67. 

84. The dietary substitute of claim 83 further comprising at least one
macronutrient selected from the group consisting of coconut oil, soy oil, canola oil, 
mono- and diglycerides, glucose, edible lactose, electrodialysed whey, 
electrodialysed skim milk, milk whey, soy protein, and other protein hydrolysates. 

85. The dietary substitute of claim 84 further comprising at least one
vitamin selected from the group consisting of Vitamins A, C, D, E, and B complex;
and at least one mineral selected from the group consisting of calcium, magnesium,
zinc, manganese, sodium, potassium, phosphorus, copper, chloride, iodine,
selenium, and iron. 

86. The dietary substitute of claim 83 or claim 85, wherein said dietary 
substitute is administered to a human or animal. 

87. A method of treating a patient having a condition caused by 
insufficient intake or production of polyunsaturated fatty acids comprising
administering to said patient said dietary substitute of claim 83 or said dietary 
supplement of claim 79 in an amount sufficient to effect said treatment. 

88. The method of claim 87, wherein said dietary substitute or said 
dietary supplement is administered enterally or parenterally. 

89. A cosmetic comprising said microbial oil or fraction thereof of claim
67.

90. The cosmetic of claim 89, wherein said cosmetic is applied topically.

91. The pharmaceutical composition of claim 69, wherein said
pharmaceutical composition is administered to a human or an animal. 

92. An animal feed comprising said microbial oil or fraction thereof of
claim 67. 

93. The method of claim 20 wherein said fungus is Mortierella species. 

94. The method of claim 93 wherein said fungus is Mortierella alpina.

95. An isolated peptide sequence selected from the group consisting of
SEQ ID NO:34 - SEQ ID NO:40. 

96. An isolated peptide sequence selected from the group consisting of 
SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:25 and SEQ ID NO:26.

97. A method for production of gamma-linolenic acid in a eukaryotic cell 
culture, said method comprising:

growing a eukaryotic cell culture having a plurality of recombinant 
eukaryotic cells, wherein said recombinant eukaryotic cells or ancestors of said 
recombinant eukaryotic cells were transformed with a vector comprising fungal 
DNA encoding a polypeptide which converts linoleic acid to gamma-linolenic acid, 
wherein said DNA is operably associated with an expression control sequence
functional in said recombinant eukaryotic cells, under conditions whereby said DNA
is expressed, whereby gamma-linolenic acid is produced from linoleic acid in said 
eukaryotic cell culture. 

98. The method according to claim 97 wherein said eukaryotic cells are
selected from the group consisting of mammalian cells, plant cells, insect cells,
fungal cells, avian cells and algal cells. 

[Figure 1, sheet 1/20: pathway diagram. Recoverable labels: higher saturated fatty acid; palmitic acid (C16); palmitoleic acid; elongase; stearic acid (C18); desaturase steps; oleic acid (18:1); linoleic acid (18:2), e.g., plants; α-linolenic acid (18:3), e.g., plants; Δ6-desaturase; 18:4 Δ6,9,12,15.]

METHODS AND COMPOSITIONS FOR SYNTHESIS OF
LONG CHAIN POLYUNSATURATED FATTY ACIDS

RELATED APPLICATION

This application is a continuation in part application of Serial Number
08/833,610 filed April 11, 1997.

INTRODUCTION

Field of the Invention

This invention relates to modulating levels of enzymes and/or enzyme
components relating to production of long chain poly-unsaturated fatty acids
(PUFAs) in a microorganism or animal.



Background 

Two main families of polyunsaturated fatty acids (PUFAs) are the ω3
fatty acids, exemplified by eicosapentaenoic acid (EPA), and the ω6 fatty acids,
exemplified by arachidonic acid (ARA). PUFAs are important components of
the plasma membrane of the cell, where they may be found in such forms as
phospholipids. PUFAs are necessary for proper development, particularly in the
developing infant brain, and for tissue formation and repair. PUFAs also serve
as precursors to other molecules of importance in human beings and animals,
including the prostacyclins, eicosanoids, leukotrienes and prostaglandins.

Four major long chain PUFAs of importance include docosahexaenoic
acid (DHA) and EPA, which are primarily found in different types of fish oil,
gamma-linolenic acid (GLA), which is found in the seeds of a number of plants,
including evening primrose (Oenothera biennis), borage (Borago officinalis)
and black currants (Ribes nigrum), and stearidonic acid (SDA), which is found
in marine oils and plant seeds. Both GLA and another important long chain
PUFA, arachidonic acid (ARA), are found in filamentous fungi. ARA can be
purified from animal tissues including liver and adrenal gland. GLA, ARA,
EPA and SDA are themselves, or are dietary precursors to, important long chain
fatty acids involved in prostaglandin synthesis, in treatment of heart disease,
and in development of brain tissue.

Polyunsaturated fatty acids have a number of pharmaceutical and
medical applications including treatment of heart disease, cancer and arthritis.

For DHA, a number of sources exist for commercial production
including a variety of marine organisms, oils obtained from cold water marine
fish, and egg yolk fractions. For ARA, microorganisms including the genera
Mortierella, Entomophthora, Pythium and Porphyridium can be used for
commercial production. Commercial sources of SDA include the genera
Trichodesma and Echium. Commercial sources of GLA include evening
primrose, black currants and borage. However, there are several disadvantages
associated with commercial production of PUFAs from natural sources. Natural
sources of PUFAs, such as animals and plants, tend to have highly
heterogeneous oil compositions. The oils obtained from these sources therefore
can require extensive purification to separate out one or more desired PUFAs or
to produce an oil which is enriched in one or more PUFAs. Natural sources also
are subject to uncontrollable fluctuations in availability. Fish stocks may
undergo natural variation or may be depleted by overfishing. Fish oils have
unpleasant tastes and odors, which may be impossible to economically separate
from the desired product, and can render such products unacceptable as food
supplements. Animal oils, and particularly fish oils, can accumulate
environmental pollutants. Weather and disease can cause fluctuation in yields
from both fish and plant sources. Cropland available for production of alternate
oil-producing crops is subject to competition from the steady expansion of
human populations and the associated increased need for food production on the
remaining arable land. Crops which do produce PUFAs, such as borage, have
not been adapted to commercial growth and may not perform well in
monoculture. Growth of such crops is thus not economically competitive where
more profitable and better established crops can be grown. Large scale
fermentation of organisms such as Mortierella is also expensive. Natural
animal tissues contain low amounts of ARA and are difficult to process.
Microorganisms such as Porphyridium and Mortierella are difficult to cultivate
on a commercial scale.

Dietary supplements and pharmaceutical formulations containing
PUFAs can retain the disadvantages of the PUFA source. Supplements such as
fish oil capsules can contain low levels of the particular desired component and
thus require large dosages. High dosages result in ingestion of high levels of
undesired components, including contaminants. Unpleasant tastes and odors of
the supplements can make such regimens undesirable, and may inhibit
compliance by the patient. Care must be taken in providing fatty acid
supplements, as overaddition may result in suppression of endogenous
biosynthetic pathways and lead to competition with other necessary fatty acids
in various lipid fractions in vivo, leading to undesirable results. For example,
Eskimos having a diet high in ω3 fatty acids have an increased tendency to
bleed (U.S. Pat. No. 4,874,603).

A number of enzymes are involved in PUFA biosynthesis. Linoleic
acid (LA, 18:2 Δ9,12) is produced from oleic acid (18:1 Δ9) by a
Δ12-desaturase. GLA (18:3 Δ6,9,12) is produced from linoleic acid (LA, 18:2
Δ9,12) by a Δ6-desaturase. ARA (20:4 Δ5,8,11,14) production from
dihomo-gamma-linolenic acid (DGLA, 20:3 Δ8,11,14) is catalyzed by a Δ5-desaturase.
However, animals cannot desaturate beyond the Δ9 position and therefore
cannot convert oleic acid (18:1 Δ9) into linoleic acid (18:2 Δ9,12). Likewise,
α-linolenic acid (ALA, 18:3 Δ9,12,15) cannot be synthesized by mammals.
Other eukaryotes, including fungi and plants, have enzymes which desaturate at
positions Δ12 and Δ15. The major poly-unsaturated fatty acids of animals
therefore are either derived from diet and/or from desaturation and elongation of
linoleic acid (18:2 Δ9,12) or α-linolenic acid (18:3 Δ9,12,15). Therefore it is
of interest to obtain genetic material involved in PUFA biosynthesis from
species that naturally produce these fatty acids and to express the isolated
material in a microbial or animal system which can be manipulated to provide
production of commercial quantities of one or more PUFAs. Thus there is a
need for fatty acid desaturases, genes encoding them, and recombinant methods
of producing them. A need further exists for oils containing higher relative
proportions of and/or enriched in specific PUFAs. A need also exists for
reliable economical methods of producing specific PUFAs.
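
The ω6 route described in this background can be traced with a small sketch. It is only an illustration under stated assumptions: a fatty acid is modeled as a pair (carbon count, Δ positions counted from the carboxyl end), the helper names `desaturate` and `elongate` are hypothetical, and the elongation of GLA to DGLA is taken from general biochemistry rather than from the paragraph above, which names only the desaturation steps.

```python
def desaturate(acid, position):
    """Add a double bond at Δ`position` (counted from the carboxyl end)."""
    carbons, bonds = acid
    return (carbons, tuple(sorted(set(bonds) | {position})))


def elongate(acid):
    """Add two carbons at the carboxyl end; every Δ position shifts by 2."""
    carbons, bonds = acid
    return (carbons + 2, tuple(p + 2 for p in bonds))


# Oleic acid, 18:1 Δ9, as the starting point.
oleic = (18, (9,))

la = desaturate(oleic, 12)   # Δ12-desaturase: linoleic acid, 18:2 Δ9,12
gla = desaturate(la, 6)      # Δ6-desaturase: GLA, 18:3 Δ6,9,12
dgla = elongate(gla)         # elongation (assumed step): DGLA, 20:3 Δ8,11,14
ara = desaturate(dgla, 5)    # Δ5-desaturase: ARA, 20:4 Δ5,8,11,14
```

A host lacking a Δ12-desaturase, as stated above for animals, cannot perform the first step and therefore has to obtain linoleic acid from the diet.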

Relevant Literature 

Production of gamma-linolenic acid by a Δ6-desaturase is described in
USPN 5,552,306. Production of 8,11-eicosadienoic acid using Mortierella
alpina is disclosed in USPN 5,376,541. Production of docosahexaenoic acid by
dinoflagellates is described in USPN 5,407,957. Cloning of a Δ6-palmitoyl-acyl
carrier protein desaturase is described in PCT publication WO 96/13591
and USPN 5,614,400. Cloning of a Δ6-desaturase from borage is described in
PCT publication WO 96/21022. Cloning of Δ9-desaturases is described in the
published patent applications PCT WO 91/13972, EP 0 550 162 A1, EP 0 561
569 A2, EP 0 644 263 A2, and EP 0 736 598 A1, and in USPN 5,057,419.
Cloning of Δ12-desaturases from various organisms is described in PCT
publication WO 94/11516 and USPN 5,443,974. Cloning of Δ15-desaturases
from various organisms is described in PCT publication WO 93/11245. All
publications and U.S. patents or applications referred to herein are hereby
incorporated in their entirety by reference.

Summary of the Invention

Novel compositions and methods are provided for preparation of
poly-unsaturated long chain fatty acids or PUFAs. The compositions include nucleic
acids encoding a Δ5-desaturase and/or polypeptides having Δ5-desaturase
activity, the polypeptides, and probes for isolating and detecting the same. The
methods involve growing a host microorganism or animal which contains and
expresses one or more transgenes encoding a Δ5-desaturase and/or a
polypeptide having Δ5-desaturase activity. Expression of the desaturase
polypeptide provides for a relative increase in Δ5-desaturated PUFA, or
metabolic progeny therefrom, as a result of altered concentrations of enzymes
and substrates involved in PUFA biosynthesis. The invention finds use for
example in the large scale production of PUFA containing oils which include,
for example, ARA, EPA and/or DHA.

In a preferred embodiment, a nucleic acid sequence comprising a
Δ5-desaturase depicted in Figure 3A-D (SEQ ID NO: 1), a polypeptide encoded by
the nucleic acid, and a purified or isolated polypeptide depicted in Figure 3A-D
(SEQ ID NO: 2), and an isolated nucleic acid encoding the polypeptide of
Figure 3A-D (SEQ ID NO: 2) are provided. Another embodiment of the
invention is an isolated nucleic acid sequence which encodes a polypeptide,
wherein said polypeptide desaturates a fatty acid molecule at carbon 5 from the
carboxyl end of the molecule. The nucleic acid is preferably derived from a
eukaryotic cell, such as a fungal cell, or a fungal cell of the genus Mortierella,
or of the genus/species Mortierella alpina. Also preferred is an isolated nucleic
acid comprising a sequence which anneals to a nucleotide sequence depicted in
Figure 3A-D (SEQ ID NO: 1), and a nucleic acid which encodes an amino acid
sequence depicted in Figure 3A-D (SEQ ID NO: 2). In particular, the nucleic
acid encodes an amino acid sequence depicted in Figure 3A-D (SEQ ID NO: 2)
which is selected from the group consisting of amino acid residues 30-38,
41-44, 171-175, 203-212, and 387-394. In an additional embodiment, the invention
provides an isolated or purified polypeptide which desaturates a fatty acid
molecule at carbon 5 from the carboxyl end of the molecule. Also provided are
an isolated nucleic acid sequence which hybridizes to a nucleotide sequence
depicted in Figure 3A-D (SEQ ID NO: 1), and an isolated nucleic acid sequence
having at least about 50% identity to Figure 3A-D (SEQ ID NO: 1).

The present invention further includes a nucleic acid construct
comprising a nucleotide sequence depicted in Figure 3A-D (SEQ ID NO: 1)
linked to a heterologous nucleic acid; a nucleic acid construct comprising a
nucleotide sequence depicted in Figure 3A-D (SEQ ID NO: 1) operably linked
to a promoter; and a nucleic acid construct comprising a nucleotide sequence
depicted in Figure 3A-D (SEQ ID NO: 1) operably linked to a promoter which
is functional in a microbial cell. In a preferred embodiment, the microbial cell
is a yeast cell, and the nucleotide sequence is derived from a fungus, such as a
fungus of the genus Mortierella, particularly a fungus of the species Mortierella
alpina.

In another embodiment of the invention, a nucleic acid construct is
provided which comprises a nucleotide sequence which encodes a polypeptide
comprising an amino acid sequence which corresponds to or is complementary
to an amino acid sequence depicted in Figure 3A-D (SEQ ID NO: 2), wherein
the nucleotide sequence is operably linked to a promoter which is functional in
a host cell, and wherein the nucleotide sequence encodes a polypeptide which
desaturates a fatty acid molecule at carbon 5 from the carboxyl end of a fatty
acid molecule. Additionally, provided by the invention is a nucleic acid
construct comprising a nucleotide sequence which encodes a functionally
active Δ5-desaturase, where the desaturase includes an amino acid sequence
which corresponds to or is complementary to all of or a portion of an amino
acid sequence depicted in Figure 3A-D (SEQ ID NO: 2), wherein the
nucleotide sequence is operably linked to a promoter functional in a host cell.

The invention also includes a host cell comprising a nucleic acid
construct of the invention. In a preferred embodiment, a recombinant host cell
is provided which comprises at least one copy of a DNA sequence which
encodes a functionally active Mortierella alpina fatty acid desaturase having an
amino acid sequence as depicted in Figure 3A-D (SEQ ID NO: 2), wherein the
cell or an ancestor of the cell was transformed with a vector comprising said
DNA sequence, and wherein the DNA sequence is operably linked to a
promoter. The host cell is either eukaryotic or prokaryotic. Preferred
eukaryotic host cells are those selected from the group consisting of a
mammalian cell, an insect cell, a fungal cell, and an algae cell. Preferred
mammalian cells include an avian cell, a fungal cell such as a yeast, and a
marine algae cell. Preferred prokaryotic cells include those selected from the
group consisting of a bacteria, a cyanobacteria, cells which contain a
bacteriophage, and/or a virus. The DNA sequence of the recombinant host cell
preferably contains a promoter which is functional in the host cell.

The host cells of the invention which contain the DNA sequences of the
invention are enriched for fatty acids, such as 20:3 fatty acids. In a preferred
embodiment, the host cells are enriched for 20:4 fatty acids as compared to an
untransformed host cell which is devoid of said DNA sequence, and/or enriched
for 20:5 fatty acids compared to an untransformed host cell which is devoid of
said DNA sequence. In yet another preferred embodiment, the invention
provides a recombinant host cell which comprises a fatty acid selected from the
group consisting of a dihomo-γ-linolenic acid, n-6 eicosatrienoic acid, 20:3n-6
acid and 20:3 (8,11,14) acid.

The present invention also includes a method for production of
arachidonic acid in a microbial cell culture, where the method comprises
growing a microbial cell culture having a plurality of microbial cells which
contain one or more nucleic acids encoding a polypeptide which converts
dihomo-γ-linolenic acid to arachidonic acid, wherein the nucleic acid is
operably linked to a promoter, under conditions whereby said one or more
nucleic acids are expressed, whereby arachidonic acid is produced in the
microbial cell culture. In several preferred embodiments of the invention, the
polypeptide is an enzyme which desaturates a fatty acid molecule at carbon 5
from the carboxyl end of the fatty acid molecule; the nucleic acid is derived
from a Mortierella sp.; and the substrate for said polypeptide is exogenously
supplied. The microbial cells used in the methods can be either eukaryotic
cells or prokaryotic cells. The preferred eukaryotic cells are those selected
from the group consisting of a mammalian cell, an insect cell, a fungal cell, and
an algae cell. Preferred mammalian cells include an avian cell, a preferred
fungal cell is a yeast, and the preferred algae cell is a marine algae cell. The
preferred prokaryotic cells include those selected from the group consisting of a
bacteria, a cyanobacteria, cells which contain a bacteriophage, and/or a virus.
The nucleic acid sequence encoding the polypeptide of the microbial cell
preferably contains a promoter which is functional in the host cell and which
optionally is an inducible promoter, for example one induced by components of
the culture broth. The preferred microbial cells used in the methods are yeast
cells, such as Saccharomyces cells.

In another embodiment of the invention, a recombinant yeast cell is
provided which converts greater than about 5% of 20:3 fatty acid substrate to a
20:4 fatty acid product.
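
The conversion threshold above can be made concrete with a short calculation. The sketch assumes that percent conversion is computed as product / (substrate + product) × 100 from measured amounts of the 20:3 substrate and the 20:4 product; that formula, the function name and the sample values are illustrative assumptions, not definitions taken from this passage.

```python
def percent_conversion(substrate: float, product: float) -> float:
    """Percent of substrate converted to product.

    Assumes conversion = product / (substrate + product) * 100, with both
    amounts in the same units (for example, percent of total fatty acids).
    """
    total = substrate + product
    if total <= 0:
        raise ValueError("substrate + product must be positive")
    return 100.0 * product / total


# Hypothetical measurements: remaining 20:3 (DGLA) and 20:4 (ARA) formed.
dgla_remaining = 9.0
ara_formed = 1.0

conversion = percent_conversion(dgla_remaining, ara_formed)
meets_threshold = conversion > 5.0  # "greater than about 5%"
```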

Also provided is an oil comprising one or more PUFAs. The amount of
said one or more PUFAs is approximately 0.3-30% arachidonic acid (ARA),
approximately 0.2-30% dihomo-γ-linolenic acid (DGLA), and approximately
0.2-30% γ-linolenic acid (GLA). A preferred oil of the invention is one in
which the ratio of ARA:DGLA:GLA is approximately 1.0:19.0:30 to
6.0:1.0:0.2. Another preferred embodiment of the invention is a pharmaceutical
composition comprising the oils in a pharmaceutically acceptable carrier.
Further provided is a nutritional composition comprising the oils of the
invention. The nutritional compositions of the invention preferably are
administered to a mammalian host parenterally or internally. A preferred
composition of the invention for internal consumption is an infant formula. In a
preferred embodiment, the nutritional compositions of the invention are in a
liquid form or a solid form.

The present invention also includes a method for desaturating a fatty
acid, where the method comprises culturing a recombinant microbial cell of the
invention under conditions suitable for expression of a polypeptide encoded by
the nucleic acid, wherein the host cell further comprises a fatty acid substrate of
the polypeptide. In a preferred embodiment, a fatty acid desaturated by the
methods is provided, including an oil comprising the fatty acid.

The present invention is also directed to purified nucleotide and peptide
sequences presented in SEQ ID NO: 1-34. The present invention is further
directed toward methods of using the sequences presented in SEQ ID NO: 1-34
as probes to identify related sequences, as components of expression systems
and as components of systems useful for producing transgenic oil.

The present invention is further directed to methods of obtaining altered
long chain polyunsaturated fatty acid biosystems by growing transgenic
microbes which encode transgene expression products which desaturate a fatty
acid molecule at carbon 5 from the carboxyl end of the fatty acid molecule.

The present invention is further directed to formulas, dietary
supplements or dietary substitutes in the form of a liquid or a solid containing
the long chain fatty acids of the invention. These formulas and supplements
may be administered to a human or an animal.

The formulas and supplements of the invention may further comprise at
least one macronutrient selected from the group consisting of coconut oil, soy
oil, canola oil, mono- and diglycerides, glucose, edible lactose, electrodialysed
whey, electrodialysed skim milk, milk whey, soy protein, and other protein
hydrolysates.

The formulas of the present invention may further include at least one
vitamin selected from the group consisting of Vitamins A, C, D, E, and B
complex; and at least one mineral selected from the group consisting of
calcium, magnesium, zinc, manganese, sodium, potassium, phosphorus, copper,
chloride, iodine, selenium, and iron.

The present invention is further directed to a method of treating a patient
having a condition caused by insufficient intake or production of polyunsaturated
fatty acids comprising administering to the patient a dietary substitute of the
invention in an amount sufficient to effect treatment of the patient.

The present invention is further directed to cosmetic and pharmaceutical
compositions of the material of the invention.

The present invention is also directed to an isolated nucleotide sequence
comprising a nucleotide sequence selected from the group consisting of: SEQ
ID NO:13; SEQ ID NO:15; SEQ ID NO:17; SEQ ID NO:19; SEQ ID NO:21;
SEQ ID NO:22; SEQ ID NO:23; SEQ ID NO:24; SEQ ID NO:25; SEQ ID
NO:26 and SEQ ID NO:27.

The present invention is also directed to an isolated peptide sequence
comprising a peptide sequence selected from the group consisting of: SEQ ID
NO:14; SEQ ID NO:16; SEQ ID NO:18; SEQ ID NO:20; SEQ ID NO:28;
SEQ ID NO:29; SEQ ID NO:30; SEQ ID NO:31; SEQ ID NO:32; SEQ ID
NO:33 and SEQ ID NO:34.

The present invention is further directed to transgenic oils in 
pharmaceutically acceptable carriers. The present invention is further directed
to nutritional supplements, cosmetic agents and infant formulae containing 
transgenic oils. 

The present invention is further directed to a method for obtaining
altered long chain polyunsaturated fatty acid biosynthesis comprising the steps
of: growing a microbe having cells which contain a transgene which encodes a
transgene expression product which desaturates a fatty acid molecule at carbon
5 from the carboxyl end of said fatty acid molecule, wherein the transgene is
operably associated with an expression control sequence, under conditions
whereby the transgene is expressed, whereby long chain polyunsaturated fatty
acid biosynthesis in the cells is altered.

The present invention is further directed to the use of chain
polyunsaturated fatty acid selected from the group consisting of ARA, DGLA
and EPA.

The present invention is further directed toward pharmaceutical 
compositions comprising at least one nutrient selected from the group consisting 
of a vitamin, a mineral, a carbohydrate, a sugar, an amino acid, a free fatty acid, 
a phospholipid, an antioxidant, and a phenolic compound. 

Brief Description of the Drawings

Figure 1 shows possible pathways for the synthesis of arachidonic acid
(20:4 Δ5,8,11,14) and stearidonic acid (18:4 Δ6,9,12,15) from palmitic acid
(C16) from a variety of organisms, including algae, Mortierella and humans.
These PUFAs can serve as precursors to other molecules important for humans
and other animals, including prostacyclins, leukotrienes, and prostaglandins,
some of which are shown.

Figure 2 shows possible pathways for production of PUFAs in addition
to ARA, including EPA and DHA, for a variety of organisms.

Figure 3A-D shows the DNA sequence of the Mortierella alpina
Δ5-desaturase and the deduced amino acid sequence.

Figure 4 shows the deduced amino acid sequence of the PCR fragment
(see Example 1).

Figure 5A and 5B show alignments of the protein sequence of the
Δ5-desaturase with Δ6-desaturases.

Figure 6A and 6B show the effect of the timing of substrate addition
relative to induction on conversion of substrate to product in SC334 containing
the Δ5-desaturase gene.

Figure 7A and 7B show the effect of inducer concentration on
Δ5-desaturase expression in SC334.

Figure 8A and 8B show the effect of induction temperature on
Δ5-desaturase activity in SC334.

Figure 9A and 9B show the effect of host strain on the conversion of
substrate to product in strains expressing the Δ5-desaturase gene at 15°C.

Figure 10A and 10B show the effect of host strain on the conversion of
substrate to product in strains expressing the Δ5-desaturase gene at 30°C.

Figure 11 shows the effect of a host strain expressing choline transferase
as well as the Δ5-desaturase gene on the conversion of substrate to product.

Figure 12A and 12B show the effect of media composition and
temperature on the conversion of substrate to product in two host strains
expressing the Δ5-desaturase gene.

Figure 13 shows alignment of the protein sequence of Ma 29 and contig 
253538a. 

Figure 14 shows alignment of the protein sequence of Ma 524 and 
contig 253538a. 

Brief Description of the Sequence Listings 



SEQ ID NO: 1 shows a DNA sequence of the Mortierella alpina
Δ5-desaturase.

SEQ ID NO: 2 shows an amino acid sequence of Mortierella alpina
Δ5-desaturase.

SEQ ID NO: 3 shows the deduced amino acid sequence of the M. alpina
PCR fragment (see Example 1).

SEQ ID NO: 4 - SEQ ID NO: 7 show the deduced amino acid sequences
of various Δ6-desaturases.

SEQ ID NO: 8 and SEQ ID NO: 9 show PCR primer sequences for
Δ6-desaturases.

SEQ ID NO: 10 shows a primer for reverse transcription of total RNA.

SEQ ID NO: 11 and SEQ ID NO: 12 show amino acid motifs for
desaturase sequences.

SEQ ID NO: 13 and SEQ ID NO: 14 show the nucleotide and amino
acid sequence of a Dictyostelium discoideum desaturase sequence.

SEQ ID NO: 15 and SEQ ID NO: 16 show the nucleotide and amino
acid sequence of a Phaeodactylum tricornutum desaturase sequence.

SEQ ID NO: 17-20 show the nucleotide and deduced amino acid 
sequence of a Schizochytrium cDNA clone. 

SEQ ID NO: 21-27 show nucleotide sequences for human desaturases. 

SEQ ID NO: 28 - SEQ ID NO: 34 show peptide sequences for human 
desaturases. 



Detailed Description of the Invention 

In order to ensure a complete understanding of the invention, the
following definitions are provided:

Δ5-Desaturase: Δ5-desaturase is an enzyme which introduces a double
bond between carbons 5 and 6 from the carboxyl end of a fatty acid molecule.

Δ6-Desaturase: Δ6-desaturase is an enzyme which introduces a double
bond between carbons 6 and 7 from the carboxyl end of a fatty acid molecule.

Δ9-Desaturase: Δ9-desaturase is an enzyme which introduces a double
bond between carbons 9 and 10 from the carboxyl end of a fatty acid molecule.

Δ12-Desaturase: Δ12-desaturase is an enzyme which introduces a
double bond between carbons 12 and 13 from the carboxyl end of a fatty acid
molecule.
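
The desaturase definitions above can be read as a simple rule on the shorthand notation: a Δn-desaturase adds position n to the list of double bonds. The sketch below is a minimal illustration of that rule, not a model of enzyme specificity; the class `FattyAcid` and the function `desaturate` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FattyAcid:
    carbons: int
    double_bonds: frozenset  # Δ positions, counted from the carboxyl end

    def shorthand(self) -> str:
        """Render the acid as, e.g., '20:4 Δ5,8,11,14' or '18:0'."""
        base = f"{self.carbons}:{len(self.double_bonds)}"
        if not self.double_bonds:
            return base
        positions = ",".join(str(p) for p in sorted(self.double_bonds))
        return f"{base} Δ{positions}"


def desaturate(acid: FattyAcid, position: int) -> FattyAcid:
    """Introduce a double bond between carbons `position` and `position + 1`."""
    if position < 2 or position >= acid.carbons:
        raise ValueError("position lies outside the carbon chain")
    if position in acid.double_bonds:
        raise ValueError("a double bond is already present at this position")
    return FattyAcid(acid.carbons, acid.double_bonds | {position})


# A Δ5-desaturase acting on DGLA (20:3 Δ8,11,14) gives ARA (20:4 Δ5,8,11,14).
dgla = FattyAcid(20, frozenset({8, 11, 14}))
ara = desaturate(dgla, 5)
```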

Fatty Acids: Fatty acids are a class of compounds containing a long 
hydrocarbon chain and a terminal carboxylate group. Fatty acids include the 
following: 



Fatty Acid

12:0           lauric acid
16:0           palmitic acid
16:1           palmitoleic acid
18:0           stearic acid
18:1           oleic acid
18:2 Δ5,9      taxoleic acid
18:2 Δ6,9      6,9-octadecadienoic acid
18:2           linoleic acid                  Δ9,12-18:2 (LA)
18:3 Δ6,9,12   gamma-linolenic acid           Δ6,9,12-18:3 (GLA)
18:3 Δ5,9,12   pinolenic acid                 Δ5,9,12-18:3
18:3           alpha-linolenic acid           Δ9,12,15-18:3 (ALA)
18:4           stearidonic acid               Δ6,9,12,15-18:4 (SDA)
20:0           arachidic acid
20:1           eicosenoic acid
22:0           behenic acid
22:1           erucic acid
22:2           docosadienoic acid
20:4 ω6        arachidonic acid               Δ5,8,11,14-20:4 (ARA)
20:3 ω6        ω6-eicosatrienoic              Δ8,11,14-20:3 (DGLA)
               (dihomo-gamma-linolenic)
20:5 ω3        eicosapentaenoic               Δ5,8,11,14,17-20:5 (EPA)
               (timnodonic acid)
20:3 ω3        ω3-eicosatrienoic              Δ11,14,17-20:3
20:4 ω3        ω3-eicosatetraenoic            Δ8,11,14,17-20:4
22:5 ω3        docosapentaenoic               Δ7,10,13,16,19-22:5 (ω3 DPA)
22:6 ω3        docosahexaenoic                Δ4,7,10,13,16,19-22:6 (DHA)
               (cervonic acid)
24:0           lignoceric acid


Taking into account these definitions, the present invention is directed to
novel DNA sequences, DNA constructs, methods and compositions
which permit modification of the poly-unsaturated long chain fatty
acid content of, for example, microbial cells or animals. Host cells are
manipulated to express a sense or antisense transcript of a DNA encoding a
polypeptide(s) which catalyzes the conversion of DGLA to ARA. The
substrate(s) for the expressed enzyme may be produced by the host cell or may
be exogenously supplied. To achieve expression, the transformed DNA is
operably associated with transcriptional and translational initiation and
termination regulatory regions that are functional in the host cell. Constructs
comprising the gene to be expressed can provide for integration into the genome
of the host cell or can autonomously replicate in the host cell. For production
of ARA, the expression cassettes generally used include a cassette which
provides for Δ5-desaturase activity, particularly in a host cell which produces or
can take up DGLA. Production of ω6-type unsaturated fatty acids, such as
ARA, is favored in a host microorganism or animal which is substantially free
of ALA. The host is selected or obtained by removing or inhibiting activity of a
Δ15- or ω3-type desaturase (see Figure 2). The endogenous desaturase activity
can be affected by providing an expression cassette for an antisense Δ15 or ω3
transcript, by disrupting a target Δ15- or ω3-desaturase gene through insertion,
substitution and/or deletion of all or part of the target gene, or by adding a Δ15-
or ω3-desaturase inhibitor. Production of LA also can be increased by
providing expression cassettes for Δ9 and/or Δ12-desaturases where their
respective enzymatic activities are limiting.



MICROBIAL PRODUCTION OF FATTY ACIDS 

Microbial production of fatty acids has several advantages over
purification from natural sources such as fish or plants. Many microbes are
known with greatly simplified oil compositions compared with those of higher
organisms, making purification of desired components easier. Microbial
production is not subject to fluctuations caused by external variables such as
weather and food supply. Microbially produced oil is substantially free of
contamination by environmental pollutants. Additionally, microbes can provide
PUFAs in particular forms which may have specific uses. For example,
Spirulina can provide PUFAs predominantly at the first and third positions of
triglycerides; digestion by pancreatic lipases preferentially releases fatty
acids from these positions. Following human or animal ingestion of
triglycerides derived from Spirulina, these PUFAs are released by pancreatic
lipases as free
fatty acids and thus are directly available, for example, for infant brain
development. Additionally, microbial oil production can be manipulated by
controlling culture conditions, notably by providing particular substrates for
microbially expressed enzymes, or by addition of compounds which suppress
undesired biochemical pathways. In addition to these advantages, production of
fatty acids from recombinant microbes provides the ability to alter the
naturally occurring microbial fatty acid profile by providing new synthetic
pathways in the host or by suppressing undesired pathways, thereby increasing
levels of desired PUFAs, or conjugated forms thereof, and decreasing levels of
undesired PUFAs.

PRODUCTION OF FATTY ACIDS IN ANIMALS 

Production of fatty acids in animals also presents several advantages.
Expression of desaturase genes in animals can produce greatly increased levels
of desired PUFAs in animal tissues, making recovery from those tissues more
economical. For example, where the desired PUFAs are expressed in the breast
milk of animals, methods of isolating PUFAs from animal milk are well
established. In addition to providing a source for purification of desired
PUFAs, animal breast milk can be manipulated through expression of
desaturase genes, either alone or in combination with other human genes, to
provide animal milks with a PUFA composition substantially similar to human
breast milk during the different stages of infant development. Humanized
animal milks could serve as infant formulas where human nursing is impossible
or undesired, or in cases of malnourishment or disease.

Depending upon the host cell, the availability of substrate, and the
desired end product(s), several polypeptides, particularly desaturases, are of
interest. By "desaturase" is intended a polypeptide which can desaturate one or
more fatty acids to produce a mono- or poly-unsaturated fatty acid or precursor
thereof of interest. Of particular interest are polypeptides which can catalyze
the conversion of DGLA to produce ARA, which includes enzymes which
desaturate at the Δ5 position. By "polypeptide" is meant any chain of amino
acids, regardless of length or post-translational modification, for example,
glycosylation or phosphorylation. Considerations for choosing a specific
polypeptide having desaturase activity include the pH optimum of the
polypeptide, whether the polypeptide is a rate limiting enzyme or a component
thereof, whether the desaturase used is essential for synthesis of a desired
poly-unsaturated fatty acid, and/or co-factors required by the polypeptide. The
expressed polypeptide preferably has parameters compatible with the
biochemical environment of its location in the host cell. For example, the
polypeptide may have to compete for substrate with other enzymes in the host
cell. Analyses of the Km and specific activity of the polypeptide in question
therefore are considered in determining the suitability of a given polypeptide
for modifying PUFA production in a given host cell. The polypeptide used in a
particular situation is one which can function under the conditions present in
the intended host cell but otherwise can be any polypeptide having desaturase
activity which has the desired characteristic of being capable of modifying the
relative production of a desired PUFA.

For production of ARA, the DNA sequence used encodes a polypeptide
having Δ5-desaturase activity. In particular instances, this can be coupled with
an expression cassette which provides for production of a polypeptide having
Δ6-desaturase activity, and the host cell can optionally be depleted of any
Δ15-desaturase activity present, for example by providing a transcription
cassette for production of antisense sequences to the Δ15-desaturase
transcription product, by disrupting the Δ15-desaturase gene, or by using a
host cell which naturally has, or has been mutated to have, low Δ15-desaturase
activity. Inhibition of undesired desaturase pathways also can be accomplished
through the use of specific desaturase inhibitors such as those described in
U.S. Patent No. 4,778,630. The choice of combination of cassettes used can
depend in part on the PUFA profile of the host cell. Where the host cell
Δ5-desaturase activity is limiting, overexpression of Δ5-desaturase alone
generally will be sufficient to provide for enhanced ARA production in the
presence of an appropriate substrate such as DGLA. ARA production also can be
increased by providing expression cassettes for Δ9- or Δ12-desaturase genes
when the activities of those desaturases are limiting. A scheme for the
synthesis of arachidonic acid (20:4 Δ5,8,11,14) from palmitic acid (C16) is
shown in Figure 1. A key enzyme in this pathway is a Δ5-desaturase which
converts dihomo-γ-linolenic acid (DGLA, eicosatrienoic acid) to ARA.
Conversion of α-linolenic acid (ALA) to stearidonic acid by a Δ6-desaturase is
also shown. Production of PUFAs in addition to ARA, including EPA and DHA, is
shown in Figure 2.
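
The route from palmitic acid to ARA can be written down as a small lookup table, which makes it easy to see which enzyme activities a given host would have to supply. The following Python sketch is illustrative only. The DGLA to ARA step is taken from the text; the remaining steps are standard ω6-pathway biochemistry assumed here rather than read from Figure 1, and the names `STEPS` and `route` are hypothetical.

```python
# Each entry maps a substrate to (product, enzyme activity).
# Only the last step is stated in the text; the others are assumed
# from standard omega-6 pathway biochemistry, not taken from Figure 1.
STEPS = {
    "palmitic acid (16:0)": ("stearic acid (18:0)", "elongase"),      # assumed
    "stearic acid (18:0)": ("oleic acid (18:1)", "Δ9-desaturase"),    # assumed
    "oleic acid (18:1)": ("LA (18:2)", "Δ12-desaturase"),             # assumed
    "LA (18:2)": ("GLA (18:3)", "Δ6-desaturase"),                     # assumed
    "GLA (18:3)": ("DGLA (20:3)", "elongase"),                        # assumed
    "DGLA (20:3)": ("ARA (20:4)", "Δ5-desaturase"),                   # from the text
}


def route(start, goal):
    """Return the enzyme activities needed, in order, to get from start to goal."""
    enzymes = []
    current = start
    while current != goal:
        if current not in STEPS:
            raise ValueError(f"no route from {start} to {goal}")
        current, enzyme = STEPS[current]
        enzymes.append(enzyme)
    return enzymes


if __name__ == "__main__":
    # A host that already makes or takes up DGLA needs only one activity.
    print(route("DGLA (20:3)", "ARA (20:4)"))
    # A host starting from LA needs three.
    print(route("LA (18:2)", "ARA (20:4)"))
```

Read this way, a host that produces or takes up DGLA needs only the Δ5-desaturase cassette, while a host limited earlier in the pathway needs the upstream activities as well.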

SOURCES OF POLYPEPTIDES
HAVING DESATURASE ACTIVITY
A source of polypeptides having desaturase activity and oligonucleotides
encoding such polypeptides are organisms which produce a desired
poly-unsaturated fatty acid. As an example, microorganisms having an ability to
produce ARA can be used as a source of Δ5-desaturase activity. Such
microorganisms include, for example, those belonging to the genera
Mortierella, Conidiobolus, Pythium, Phytophthora, Penicillium,
Porphyridium, Cladosporium, Mucor, Fusarium, Aspergillus, Rhodotorula, and
Entomophthora. Within the genus Porphyridium, of particular interest is
Porphyridium cruentum. Within the genus Mortierella, of particular interest are
Mortierella elongata, Mortierella exigua, Mortierella hygrophila, Mortierella
ramanniana var. angulispora, and Mortierella alpina. Within the genus Mucor,
of particular interest are Mucor circinelloides and Mucor javanicus.

DNAs encoding desired desaturases can be identified in a variety of
ways. As an example, a source of the desired desaturase, for example genomic
or cDNA libraries from Mortierella, is screened with detectable enzymatically-
or chemically-synthesized probes, which can be made from DNA, RNA, or
non-naturally occurring nucleotides, or mixtures thereof. Probes may be
enzymatically synthesized from DNAs of known desaturases for normal or
reduced-stringency hybridization methods. Oligonucleotide probes also can be
used to screen sources and can be based on sequences of known desaturases,
including sequences conserved among known desaturases, or on peptide
sequences obtained from the desired purified protein. Oligonucleotide probes
based on amino acid sequences can be degenerate to encompass the degeneracy
of the genetic code, or can be biased in favor of the preferred codons of the
source organism. Oligonucleotides also can be used as primers for PCR from
reverse transcribed mRNA from a known or suspected source; the PCR product
can be the full length cDNA or can be used to generate a probe to obtain the
desired full length cDNA. Alternatively, a desired protein can be entirely
sequenced and total synthesis of a DNA encoding that polypeptide performed.
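
When a probe is designed from peptide sequence, the number of distinct oligonucleotides needed to cover every possible coding sequence grows with the codon degeneracy of each residue. As a rough illustration, the Python sketch below counts that degeneracy using the standard genetic code and picks the least degenerate window of a peptide. The function names and the example peptide are hypothetical and are not taken from the desaturase sequence.

```python
from math import prod

# Standard genetic code, built in TCAG order (codon -> one-letter amino acid).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Number of codons that encode each amino acid.
CODONS_PER_AA = {}
for aa in CODON_TABLE.values():
    CODONS_PER_AA[aa] = CODONS_PER_AA.get(aa, 0) + 1


def degeneracy(peptide):
    """Number of distinct DNA sequences that can encode the peptide."""
    return prod(CODONS_PER_AA[aa] for aa in peptide)


def least_degenerate_window(peptide, length):
    """Return (start, window, degeneracy) for the least degenerate window."""
    windows = [
        (start, peptide[start:start + length])
        for start in range(len(peptide) - length + 1)
    ]
    start, window = min(windows, key=lambda w: degeneracy(w[1]))
    return start, window, degeneracy(window)


if __name__ == "__main__":
    # Hypothetical peptide fragment used only for illustration.
    print(least_degenerate_window("LSRMWKD", 3))
```

Windows rich in methionine and tryptophan, which have a single codon each, give the smallest probe pools; biasing toward the preferred codons of the source organism is the alternative mentioned above when no such window exists.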

Once the desired genomic or cDNA has been isolated, it can be
sequenced by known methods. It is recognized in the art that such methods are
subject to errors, such that multiple sequencing of the same region is routine
and is still expected to lead to measurable rates of mistakes in the resulting
deduced sequence, particularly in regions having repeated domains, extensive
secondary structure, or unusual base compositions, such as regions with high
GC base content. When discrepancies arise, resequencing can be done and can
employ special methods. Special methods can include altering sequencing
conditions by using: different temperatures; different enzymes; proteins which
alter the ability of oligonucleotides to form higher order structures; altered
nucleotides such as ITP or methylated dGTP; different gel compositions, for
example adding formamide; different primers or primers located at different
distances from the problem region; or different templates such as single
stranded DNAs. Sequencing of mRNA also can be employed.

For the most part, some or all of the coding sequence for the polypeptide
having desaturase activity is from a natural source. In some situations,
however, it is desirable to modify all or a portion of the codons, for example,
to enhance expression, by employing host preferred codons. Host preferred
codons can be determined from the codons of highest frequency in the proteins
expressed in the largest amount in a particular host species of interest. Thus,
the coding sequence for a polypeptide having desaturase activity can be
synthesized in whole or in part. All or portions of the DNA also can be
synthesized to remove any destabilizing sequences or regions of secondary
structure which would be present in the transcribed mRNA. All or portions of
the DNA also can be synthesized to alter the base composition to one more
preferable in the desired host cell. Methods for synthesizing sequences and
bringing sequences together are well established in the literature. In vitro
mutagenesis and selection, site-directed mutagenesis, or other means can be
employed to obtain mutations of naturally occurring desaturase genes to
produce a polypeptide having desaturase activity in vivo with more desirable
physical and kinetic parameters for function in the host cell, such as a longer
half-life or a higher rate of production of a desired polyunsaturated fatty
acid.
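
The idea of host preferred codons can be sketched in a few lines: tally codon usage in the host's most abundantly expressed proteins, keep the most frequent codon for each amino acid, and back-translate the protein with that table. The Python example below is a minimal sketch under stated assumptions. The codon counts are invented toy numbers rather than measured usage for any host, and the function names are hypothetical.

```python
# Hypothetical codon counts tallied from highly expressed host genes.
# These are toy numbers for illustration, not real usage data.
CODON_COUNTS = {
    "M": {"ATG": 100},
    "K": {"AAG": 80, "AAA": 20},
    "L": {"TTG": 60, "TTA": 30, "CTT": 10},
    "*": {"TAA": 70, "TGA": 20, "TAG": 10},
}


def preferred_codons(counts):
    """Pick the most frequent codon for each amino acid."""
    return {aa: max(usage, key=usage.get) for aa, usage in counts.items()}


def back_translate(protein, counts=CODON_COUNTS):
    """Encode a protein using only the host preferred codon for each residue."""
    table = preferred_codons(counts)
    return "".join(table[aa] for aa in protein)


if __name__ == "__main__":
    print(back_translate("MKL*"))
```

A real design would also screen the result for destabilizing sequences and secondary structure, as the paragraph above notes; this sketch covers only the codon choice.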

Mortierella alpina Desaturase

Of particular interest is the Mortierella alpina Δ5-desaturase which has
446 amino acids; the amino acid sequence is shown in Figure 3. The gene
encoding the Mortierella alpina Δ5-desaturase can be expressed in transgenic
microorganisms or animals to effect greater synthesis of ARA from DGLA.
Other DNAs which are substantially identical to the Mortierella alpina
Δ5-desaturase DNA, or which encode polypeptides which are substantially
identical to the Mortierella alpina Δ5-desaturase polypeptide, also can be
used. By substantially identical is intended an amino acid sequence or nucleic
acid sequence exhibiting in order of increasing preference at least 60%, 80%,
90% or 95% homology to the Mortierella alpina Δ5-desaturase amino acid sequence
or nucleic acid sequence encoding the amino acid sequence. For polypeptides,
the length of comparison sequences generally is at least 16 amino acids,
preferably at least 20 amino acids, or most preferably 35 amino acids. For
nucleic acids, the length of comparison sequences generally is at least 50
nucleotides, preferably at least 60 nucleotides, more preferably at least 75
nucleotides, and most preferably 110 nucleotides. Homology typically is
measured using sequence analysis software, for example, the Sequence Analysis
software package of the Genetics Computer Group, University of Wisconsin
Biotechnology Center, 1710 University Avenue, Madison, Wisconsin 53705,
MEGAlign (DNAStar, Inc., 1228 S. Park St., Madison, Wisconsin 53715), and
MacVector (Oxford Molecular Group, 2105 S. Bascom Avenue, Suite 200,
Campbell, California 95008). Such software matches similar sequences by
assigning degrees of homology to various substitutions, deletions, and other
modifications. Conservative substitutions typically include substitutions within
the following groups: glycine and alanine; valine, isoleucine and leucine;
aspartic acid, glutamic acid, asparagine, and glutamine; serine and threonine;
lysine and arginine; and phenylalanine and tyrosine. Substitutions may also be
made on the basis of conserved hydrophobicity or hydrophilicity (Kyte and
Doolittle, J. Mol. Biol. 157: 105-132, 1982), or on the basis of the ability to
assume similar polypeptide secondary structure (Chou and Fasman, Adv.
Enzymol. 47: 45-148, 1978).
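
To make the homology thresholds concrete, the following Python sketch scores two sequences that have already been aligned, counting identical residues and substitutions that fall within the conservative groups listed above. It is an illustration only and does not reproduce the scoring used by the software packages named in the text. Skipping gap positions is an assumption of this sketch, and the function name and example sequences are hypothetical.

```python
# Conservative substitution groups as listed in the text (one-letter codes).
GROUPS = [set("GA"), set("VIL"), set("DENQ"), set("ST"), set("KR"), set("FY")]


def score_alignment(a, b):
    """Return (percent identity, percent similarity) for two aligned sequences.

    Both inputs must have the same length; '-' marks a gap. Gap positions
    are skipped, which is an assumption of this sketch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = identical = conservative = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            identical += 1
        elif any(x in group and y in group for group in GROUPS):
            conservative += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    identity = 100 * identical / compared
    similarity = 100 * (identical + conservative) / compared
    return identity, similarity


if __name__ == "__main__":
    # Hypothetical five-residue alignment, for illustration only.
    print(score_alignment("GAVKF", "GGIRW"))
```

In practice the comparison window would be at least the 16 to 35 residues stated above; a five-residue example is used here only to keep the arithmetic visible.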

Other Desaturases

Encompassed by the present invention are related desaturases from the
same or other organisms. Such related desaturases include variants of the
disclosed Δ5-desaturase naturally occurring within the same or different species
of Mortierella, as well as homologues of the disclosed Δ5-desaturase from other
species. Also included are desaturases which, although not substantially
identical to the Mortierella alpina Δ5-desaturase, desaturate a fatty acid
molecule at carbon 5 from the carboxyl end of a fatty acid molecule. Related
desaturases can be identified by their ability to function substantially the
same as the disclosed desaturases; that is, they are still able to effectively
convert DGLA to ARA. Related desaturases also can be identified by screening
sequence databases for sequences homologous to the disclosed desaturase, by
hybridization of a probe based on the disclosed desaturase to a library
constructed from the source organism, or by RT-PCR using mRNA from the
source organism and primers based on the disclosed desaturase. Such
desaturases include those from humans, Dictyostelium discoideum and
Phaeodactylum tricornutum.

The regions of a desaturase polypeptide important for desaturase activity
can be determined through routine mutagenesis, expression of the resulting
mutant polypeptides and determination of their activities. Mutants may include
deletions, insertions and point mutations, or combinations thereof. A typical
functional analysis begins with deletion mutagenesis to determine the N- and
C-terminal limits of the protein necessary for function, and then internal
deletions, insertions or point mutants are made to further determine regions
necessary for function. Other techniques such as cassette mutagenesis or total
synthesis also can be used. Deletion mutagenesis is accomplished, for example,
by using exonucleases to sequentially remove the 5' or 3' coding regions. Kits
are available for such techniques. After deletion, the coding region is
completed by ligating oligonucleotides containing start or stop codons to the
deleted coding region after 5' or 3' deletion, respectively. Alternatively,
oligonucleotides encoding start or stop codons are inserted into the coding
region by a variety of methods including site-directed mutagenesis, mutagenic
PCR or by ligation onto DNA digested at existing restriction sites. Internal
deletions can similarly be made through a variety of methods including the use
of existing restriction sites in the DNA, or by use of mutagenic primers via
site-directed mutagenesis or mutagenic PCR. Insertions are made through methods
such as linker-scanning mutagenesis, site-directed mutagenesis or mutagenic
PCR. Point mutations are made through techniques such as site-directed
mutagenesis or mutagenic PCR.
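
The deletion series described above can be planned in silico before any bench work. The following Python sketch generates N-terminal and C-terminal truncations of a coding sequence in whole-codon steps, restoring a start codon after a 5' deletion and a stop codon after a 3' deletion. It is a hedged illustration: the function name, the toy coding sequence and the choice of TAA as the added stop codon are assumptions, not details from the source.

```python
def truncations(cds, step=1, stop="TAA"):
    """Plan 5' and 3' deletion constructs for a coding sequence.

    cds must start with ATG, end with a stop codon, and have a length that
    is a multiple of three. Returns (n_terminal, c_terminal) lists in which
    each construct has lost `step` more codons than the one before it.
    """
    if len(cds) % 3 != 0 or not cds.startswith("ATG"):
        raise ValueError("expected an in-frame coding sequence starting with ATG")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    body = codons[1:-1]          # codons between the start and stop codons
    original_stop = codons[-1]
    # 5' deletions: drop codons after the start, then restore ATG.
    n_terminal = [
        "ATG" + "".join(body[k:]) + original_stop
        for k in range(step, len(body), step)
    ]
    # 3' deletions: drop codons before the stop, then add a stop codon.
    c_terminal = [
        "ATG" + "".join(body[:-k]) + stop
        for k in range(step, len(body), step)
    ]
    return n_terminal, c_terminal


if __name__ == "__main__":
    # Toy coding sequence (Met-Lys-Pro-Gly-stop), for illustration only.
    n_term, c_term = truncations("ATGAAACCCGGGTAA")
    print(n_term)
    print(c_term)
```

Each construct would then be expressed and assayed for desaturase activity to locate the N- and C-terminal limits of the functional region.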

Chemical mutagenesis also can be used for identifying regions of a
desaturase polypeptide important for activity. A mutated construct is expressed,
and the ability of the resulting altered protein to function as a desaturase is
assayed. Such structure-function analysis can determine which regions may be
deleted, which regions tolerate insertions, and which point mutations allow the
mutant protein to function in substantially the same way as the native
desaturase. All such mutant proteins and nucleotide sequences encoding them
are within the scope of the present invention.

EXPRESSION OF DESATURASE GENES 
Once the DNA encoding a desaturase polypeptide has been obtained, it
is placed in a vector capable of replication in a host cell, or is propagated in
vitro by means of techniques such as PCR or long PCR. Replicating vectors
can include plasmids, phage, viruses, cosmids and the like. Desirable vectors
include those useful for mutagenesis of the gene of interest or for expression of
the gene of interest in host cells. The technique of long PCR has made in vitro
propagation of large constructs possible, so that modifications to the gene of
interest, such as mutagenesis or addition of expression signals, and propagation
of the resulting constructs can occur entirely in vitro without the use of a
replicating vector or a host cell.

For expression of a desaturase polypeptide, functional transcriptional
and translational initiation and termination regions are operably linked to the
DNA encoding the desaturase polypeptide. Expression of the polypeptide
coding region can take place in vitro or in a host cell. Transcriptional and
translational initiation and termination regions are derived from a variety of
nonexclusive sources, including the DNA to be expressed, genes known or
suspected to be capable of expression in the desired system, expression vectors,
chemical synthesis, or from an endogenous locus in a host cell.

Expression In Vitro 

In vitro expression can be accomplished, for example, by placing the
coding region for the desaturase polypeptide in an expression vector designed
for in vitro use and adding rabbit reticulocyte lysate and cofactors; labeled
amino acids can be incorporated if desired. Such in vitro expression vectors
may provide some or all of the expression signals necessary in the system used.
These methods are well known in the art and the components of the system are
commercially available. The reaction mixture can then be assayed directly for
the polypeptide, for example by determining its activity, or the synthesized
polypeptide can be purified and then assayed.

Expression In A Host Cell 

Expression in a host cell can be accomplished in a transient or stable
fashion. Transient expression can occur from introduced constructs which
contain expression signals functional in the host cell, but which constructs do
not replicate and rarely integrate in the host cell, or where the host cell is not
proliferating. Transient expression also can be accomplished by inducing the
activity of a regulatable promoter operably linked to the gene of interest,
although such inducible systems frequently exhibit a low basal level of
expression. Stable expression can be achieved by introduction of a construct
that can integrate into the host genome or that autonomously replicates in the
host cell. Stable expression of the gene of interest can be selected for through
the use of a selectable marker located on or transfected with the expression
construct, followed by selection for cells expressing the marker. When stable
expression results from integration, integration of constructs can occur
randomly within the host genome or can be targeted through the use of
constructs containing regions of homology with the host genome sufficient to
target recombination with the host locus. Where constructs are targeted to an
endogenous locus, all or some of the transcriptional and translational regulatory
regions can be provided by the endogenous locus.

When increased expression of the desaturase polypeptide in the source
organism is desired, several methods can be employed. Additional genes
encoding the desaturase polypeptide can be introduced into the host organism.
Expression from the native desaturase locus also can be increased through
homologous recombination, for example by inserting a stronger promoter into
the host genome to cause increased expression, by removing destabilizing
sequences from either the mRNA or the encoded protein by deleting that
information from the host genome, or by adding stabilizing sequences to the
mRNA (USPN 4,910,141).

When it is desirable to express more than one gene, using appropriate
regulatory regions and expression methods, the introduced genes can be
propagated in the host cell through use of replicating vectors or by
integration into the host genome. Where two or more genes are expressed from
separate replicating vectors, it is desirable that each vector has a different
means of replication. Each introduced construct, whether integrated or not,
should have a different means of selection and should lack homology to the
other constructs to maintain stable expression and prevent reassortment of
elements among constructs. Judicious choices of regulatory regions, selection
means and method of propagation of the introduced construct can be
experimentally determined so that all introduced genes are expressed at the
necessary levels to provide for synthesis of the desired products.

As an example, where the host cell is a yeast, transcriptional and
translational regions functional in yeast cells are provided, particularly from the
host species. The transcriptional initiation regulatory regions can be obtained,
for example, from genes in the glycolytic pathway, such as alcohol
dehydrogenase, glyceraldehyde-3-phosphate dehydrogenase (GPD),
phosphoglucoisomerase, phosphoglycerate kinase, etc., or regulatable genes
such as acid phosphatase, lactase, metallothionein, glucoamylase, etc. Any one
of a number of regulatory sequences can be used in a particular situation,
depending upon whether constitutive or induced transcription is desired, the
particular efficiency of the promoter in conjunction with the open-reading frame
of interest, the ability to join a strong promoter with a control region from a
different promoter which allows for inducible transcription, ease of
construction, and the like. Of particular interest are promoters which are
activated in the presence of galactose. Galactose-inducible promoters (GAL1,
GAL7, and GAL10) have been extensively utilized for high level and regulated
expression of protein in yeast (Lue et al., Mol. Cell. Biol. Vol. 7, p. 3446, 1987;
Johnston, Microbiol. Rev. Vol. 51, p. 458, 1987). Transcription from the GAL
promoters is activated by the GAL4 protein, which binds to the promoter region
and activates transcription when galactose is present. In the absence of
galactose, the antagonist GAL80 binds to GAL4 and prevents GAL4 from
activating transcription. Addition of galactose prevents GAL80 from inhibiting
activation by GAL4.
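
The GAL switch described above reduces to a small piece of Boolean logic, which the following Python sketch spells out. It is a deliberately simplified model of the three statements in the text (GAL4 activates, GAL80 blocks GAL4, galactose relieves the block) and ignores basal expression and glucose repression. The function and parameter names are hypothetical.

```python
def gal_promoter_active(galactose_present, gal4_present=True, gal80_present=True):
    """Simplified on/off model of a GAL promoter.

    GAL4 is required for activation. GAL80 blocks GAL4 unless galactose
    is present, in which case the block is relieved.
    """
    if not gal4_present:
        return False
    gal80_blocks_gal4 = gal80_present and not galactose_present
    return not gal80_blocks_gal4


if __name__ == "__main__":
    for galactose in (False, True):
        state = "on" if gal_promoter_active(galactose) else "off"
        print(f"galactose={galactose}: promoter {state}")
```

Within this model, a host lacking GAL80 would express from the promoter without galactose. That is a consequence of the simplification rather than a claim made in the text.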

Nucleotide sequences surrounding the translational initiation codon
ATG have been found to affect expression in yeast cells. If the desired
polypeptide is poorly expressed in yeast, the nucleotide sequences of exogenous
genes can be modified to include an efficient yeast translation initiation
sequence to obtain optimal gene expression. For expression in Saccharomyces,
this can be done by site-directed mutagenesis of an inefficiently expressed gene
by fusing it in-frame to an endogenous Saccharomyces gene, preferably a highly
expressed gene, such as the lactase gene.

The termination region can be derived from the 3' region of the gene
from which the initiation region was obtained or from a different gene. A large
number of termination regions are known and have been found to be
satisfactory in a variety of hosts from the same and different genera and species.
The termination region usually is selected more as a matter of convenience
than because of any particular property. Preferably, the termination
region is derived from a yeast gene, particularly Saccharomyces,
Schizosaccharomyces, Candida or Kluyveromyces. The 3' regions of two
mammalian genes, γ interferon and α2 interferon, are also known to function in
yeast.

INTRODUCTION OF CONSTRUCTS INTO HOST CELLS 

Constructs comprising the gene of interest may be introduced into a host
cell by standard techniques. These techniques include transformation,
protoplast fusion, lipofection, transfection, transduction, conjugation, infection,
biolistic impact, electroporation, microinjection, scraping, or any other method
which introduces the gene of interest into the host cell. Methods of
transformation which are used include lithium acetate transformation (Methods
in Enzymology, Vol. 194, p. 186-187, 1991). For convenience, a host cell which
has been manipulated by any method to take up a DNA sequence or construct
will be referred to as "transformed" or "recombinant" herein.

The subject host will have at least one copy of the expression
construct and may have two or more, depending upon whether the gene is
integrated into the genome, amplified, or is present on an extrachromosomal
element having multiple copy numbers. Where the subject host is a yeast, four
principal types of yeast plasmid vectors can be used: Yeast Integrating plasmids
(YIps), Yeast Replicating plasmids (YRps), Yeast Centromere plasmids
(YCps), and Yeast Episomal plasmids (YEps). YIps lack a yeast replication
origin and must be propagated as integrated elements in the yeast genome.
YRps have a chromosomally derived autonomously replicating sequence and
are propagated as medium copy number (20 to 40), autonomously replicating,
unstably segregating plasmids. YCps have both a replication origin and a
centromere sequence and propagate as low copy number (10-20), autonomously
replicating, stably segregating plasmids. YEps have an origin of replication
from the yeast 2μm plasmid and are propagated as high copy number,
autonomously replicating, irregularly segregating plasmids. The presence of the
plasmids in yeast can be ensured by maintaining selection for a marker on the
plasmid. Of particular interest are the yeast vectors pYES2 (a YEp plasmid
available from Invitrogen, which confers uracil prototrophy and has a GAL1
galactose-inducible promoter for expression), pRS425-pG1 (a YEp plasmid
obtained from Dr. T. H. Chang, Ass. Professor of Molecular Genetics, Ohio State
University, containing a constitutive GPD promoter and conferring leucine
prototrophy), and pYX424 (a YEp plasmid having a constitutive TPI promoter and
conferring leucine prototrophy; Alber, T. and Kawasaki, G. (1982). J. Mol. &
Appl. Genetics 1:419).

The transformed host cell can be identified by selection for a marker
contained on the introduced construct. Alternatively, a separate marker
construct may be introduced with the desired construct, as many transformation
techniques introduce many DNA molecules into host cells. Typically,
transformed hosts are selected for their ability to grow on selective media.
Selective media may incorporate an antibiotic or lack a factor necessary for
growth of the untransformed host, such as a nutrient or growth factor. An
introduced marker gene therefore may confer antibiotic resistance, or encode an
essential growth factor or enzyme, and permit growth on selective media when
expressed in the transformed host. Selection of a transformed host also can
occur when the expressed marker protein can be detected, either directly or
indirectly. The marker protein may be expressed alone or as a fusion to another
protein. The marker protein can be detected by its enzymatic activity; for
example, β-galactosidase can convert the substrate X-gal to a colored product,
and luciferase can convert luciferin to a light-emitting product. The marker
protein can be detected by its light-producing or modifying characteristics; for
example, the green fluorescent protein of Aequorea victoria fluoresces when
illuminated with blue light. Antibodies can be used to detect the marker
protein or a molecular tag on, for example, a protein of interest. Cells
expressing the marker protein or tag can be selected, for example, visually, or
by techniques such as FACS or panning using antibodies. For selection of yeast
transformants, any marker that functions in yeast may be used. Desirably,
resistance to kanamycin and the aminoglycoside G418 are of interest, as well as
ability to grow on media lacking uracil, leucine, lysine or tryptophan.
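
Nutritional selection comes down to comparing what the untransformed host cannot make with what the introduced marker restores. The Python sketch below works out which nutrients to omit from the medium so that only transformants grow, and which must still be supplied. It is a minimal sketch under stated assumptions. The function names and the example host requirements are hypothetical, and the only fact taken from the text is that a plasmid such as pYES2 confers uracil prototrophy.

```python
def selective_dropouts(host_requirements, plasmid_confers):
    """Nutrients to leave out of the medium so that only transformants grow."""
    usable = sorted(set(host_requirements) & set(plasmid_confers))
    if not usable:
        raise ValueError("plasmid marker does not complement any host requirement")
    return usable


def required_supplements(host_requirements, plasmid_confers):
    """Nutrients the transformed host still cannot make and must be given."""
    return sorted(set(host_requirements) - set(plasmid_confers))


if __name__ == "__main__":
    # Hypothetical host that requires uracil and leucine.
    host = {"uracil", "leucine"}
    # A plasmid conferring uracil prototrophy, as the text states for pYES2.
    plasmid = {"uracil"}
    print("omit:", selective_dropouts(host, plasmid))
    print("supply:", required_supplements(host, plasmid))
```

The same comparison shows why two constructs in one host need different markers: each omitted nutrient selects for only one plasmid.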

The Δ5-desaturase-mediated production of PUFAs can be performed in
either prokaryotic or eukaryotic host cells. Prokaryotic cells of interest include
Escherichia, Bacillus, Lactobacillus, cyanobacteria and the like. Eukaryotic
cells include mammalian cells such as those of lactating animals, avian cells
such as of chickens, and other cells amenable to genetic manipulation including
insect, fungal, and algae cells. The cells may be cultured or formed as part or
all of a host organism including an animal. Viruses and bacteriophage also may
be used with the cells in the production of PUFAs, particularly for gene transfer,
cellular targeting and selection. In a preferred embodiment, the host is any
microorganism or animal which produces DGLA and/or can assimilate
exogenously supplied DGLA, and preferably produces large amounts of DGLA.
Examples of host animals include mice, rats, rabbits, chickens, quail, turkeys,
bovines, sheep, pigs, goats, yaks, etc., which are amenable to genetic
manipulation and cloning for rapid expansion of the transgene expressing
population. For animals, a Δ5-desaturase transgene can be adapted for
expression in target organelles, tissues and body fluids through modification of
the gene regulatory regions. Of particular interest is the production of PUFAs
in the breast milk of the host animal.

Expression In Yeast 

Examples of host microorganisms include Saccharomyces cerevisiae,
Saccharomyces carlsbergensis, or other yeast such as Candida, Kluyveromyces
or other fungi, for example, filamentous fungi such as Aspergillus, Neurospora,
Penicillium, etc. Desirable characteristics of a host microorganism are, for
example, that it is genetically well characterized, can be used for high level
expression of the product using ultra-high density fermentation, and is on the
GRAS (generally recognized as safe) list since the proposed end product is
intended for ingestion by humans. Of particular interest is use of a yeast, more
particularly baker's yeast (S. cerevisiae), as a cell host in the subject invention.
Strains of particular interest are SC334 (Mat a pep4-3 prb1-1122 ura3-52
leu2-3,112 reg1-501 gal1; Gene 83:57-64, 1989, Hovland P. et al.), YTC34 (a
ade2-101 his3Δ200 lys2-801 ura3-52; obtained from Dr. T. H. Chang, Ass.
Professor of Molecular Genetics, Ohio State University), YTC41 (a/a
ura3-52/ura3-52 lys2-801/lys2-801 ade2-101/ade2-101 trp1-Δ1/trp1-Δ1
his3Δ200/his3Δ200 leu2Δ1/leu2Δ1; obtained from Dr. T. H. Chang, Ass. Professor
of Molecular Genetics, Ohio State University), BJ1995 (obtained from the Yeast
Genetic Stock Centre, 1021 Donner Laboratory, Berkeley, CA 94720), INVSC1
(Mat a his3Δ1 leu2 trp1-289 ura3-52; obtained from Invitrogen, 1600 Faraday
Ave., Carlsbad, CA 92008) and INVSC2 (Mat a his3Δ200 ura3-167; obtained from
Invitrogen).

Expression In Avian Species 

For producing PUFAs in avian species and cells, such as chickens, 
turkeys, quail and ducks, gene transfer can be performed by introducing a 
nucleic acid sequence encoding a Δ5-desaturase into the cells following 
procedures known in the art. If a transgenic animal is desired, pluripotent stem 
cells of embryos can be provided with a vector carrying a Δ5-desaturase 
encoding transgene and developed into an adult animal (USPN 5,162,215; Ono et 
al. (1996) Comparative Biochemistry and Physiology A 113(3):287-292; WO 
9612793; WO 9606160). In most cases, the transgene will be modified to 
express high levels of the desaturase in order to increase production of PUFAs. 
The transgene can be modified, for example, by providing transcriptional and/or 
translational regulatory regions that function in avian cells, such as promoters 
which direct expression in particular tissues and egg parts such as yolk. The 
gene regulatory regions can be obtained from a variety of sources, including 
chicken anemia or avian leukosis viruses or avian genes such as a chicken 
ovalbumin gene. 

Expression In Insect Cells 

Production of PUFAs in insect cells can be conducted using baculovirus 
expression vectors harboring a Δ5-desaturase transgene. Baculovirus 
expression vectors are available from several commercial sources such as 
Clontech. Methods for producing hybrid and transgenic strains of algae, such 
as marine algae, which contain and express a desaturase transgene also are 
provided. For example, transgenic marine algae may be prepared as described 
in USPN 5,426,040. As with the other expression systems described above, the 
timing, extent of expression and activity of the desaturase transgene can be 
regulated by fitting the polypeptide coding sequence with the appropriate 
transcriptional and translational regulatory regions selected for a particular use. 

Of particular interest are promoter regions which can be induced under 
preselected growth conditions. For example, introduction of temperature 
sensitive and/or metabolite responsive mutations into the desaturase transgene 
coding sequences, its regulatory regions, and/or the genome of cells into which 
the transgene is introduced can be used for this purpose. 

Expression In Plants 

Production of PUFAs in plants can be conducted using various plant 
transformation systems such as the use of Agrobacterium tumefaciens, plant 
viruses, particle cell transformation and the like which are disclosed in 
Applicant's related applications U.S. Application Serial Nos. 08/834,033 and 
08/956,985 and continuation-in-part applications filed simultaneously with this 
application, all of which are hereby incorporated by reference. 

The transformed host cell is grown under appropriate conditions adapted 
for a desired end result. For host cells grown in culture, the conditions are 
typically optimized to produce the greatest or most economical yield of PUFAs, 
which relates to the selected desaturase activity. Media conditions which may 
be optimized include: carbon source, nitrogen source, addition of substrate, 
final concentration of added substrate, form of substrate added, aerobic or 
anaerobic growth, growth temperature, inducing agent, induction temperature, 
growth phase at induction, growth phase at harvest, pH, density, and 
maintenance of selection. Microorganisms such as yeast, for example, are 
preferably grown using selected media of interest, which include yeast peptone 
broth (YPD) and minimal media (contains amino acids, yeast nitrogen base, and 
ammonium sulfate, and lacks a component for selection, for example uracil). 
Desirably, substrates to be added are first dissolved in ethanol. Where 
necessary, expression of the polypeptide of interest may be induced, for 
example by including or adding galactose to induce expression from a GAL 
promoter. 

Expression In An Animal 

Expression in cells of a host animal can likewise be accomplished in a 
transient or stable manner. Transient expression can be accomplished via known 
methods, for example infection or lipofection, and can be repeated in order to 
maintain desired expression levels of the introduced construct (see Ebert, PCT 
publication WO 94/05782). Stable expression can be accomplished via 
integration of a construct into the host genome, resulting in a transgenic animal. 
The construct can be introduced, for example, by microinjection of the construct 
into the pronuclei of a fertilized egg, or by transfection, retroviral infection or 
other techniques whereby the construct is introduced into a cell line which may 
form or be incorporated into an adult animal (U.S. Patent No. 4,873,191; U.S. 
Patent No. 5,530,177; U.S. Patent No. 5,565,362; U.S. Patent No. 5,366,894; 
Wilmut et al. (1997) Nature 385:810). The recombinant eggs or embryos are 
transferred to a surrogate mother (U.S. Patent No. 4,873,191; U.S. Patent No. 
5,530,177; U.S. Patent No. 5,565,362; U.S. Patent No. 5,366,894; Wilmut et al. 
(supra)). 

After birth, transgenic animals are identified, for example, by the 
presence of an introduced marker gene, such as for coat color, or by PCR or 
Southern blotting from a blood, milk or tissue sample to detect the introduced 
construct, or by an immunological or enzymological assay to detect the 
expressed protein or the products produced therefrom (U.S. Patent No. 
4,873,191; U.S. Patent No. 5,530,177; U.S. Patent No. 5,565,362; U.S. Patent 
No. 5,366,894; Wilmut et al. (supra)). The resulting transgenic animals may be 
entirely transgenic or may be mosaics, having the transgenes in only a subset of 
their cells. The advent of mammalian cloning, accomplished by fusing a 
nucleated cell with an enucleated egg, followed by transfer into a surrogate 
mother, presents the possibility of rapid, large-scale production upon obtaining 
a "founder" animal or cell comprising the introduced construct; prior to this, it 
was necessary for the transgene to be present in the germ line of the animal for 
propagation (Wilmut et al. (supra)). 

Expression in a host animal presents certain efficiencies, particularly 
where the host is a domesticated animal. For production of PUFAs in a fluid 
readily obtainable from the host animal, such as milk, the desaturase transgene 
can be expressed in mammary cells from a female host, and the PUFA content 
of the host cells altered. The desaturase transgene can be adapted for expression 
so that it is retained in the mammary cells, or secreted into milk, to form the 
PUFA reaction products localized to the milk (PCT publication WO 95/24488). 
Expression can be targeted to mammary tissue using specific 
regulatory sequences, such as those of bovine α-lactalbumin, α-casein, 
β-casein, γ-casein, κ-casein, β-lactoglobulin, or whey acidic protein, and may 
optionally include one or more introns and/or secretory signal sequences (U.S. 
Patent No. 5,530,177; Rosen, U.S. Patent No. 5,565,362; Clark et al., U.S. 
Patent No. 5,366,894; Garner et al., PCT publication WO 95/23868). 
Expression of desaturase transgenes, or antisense desaturase transcripts, adapted 
in this manner can be used to alter the levels of specific PUFAs, or derivatives 
thereof, found in the animal's milk. Additionally, the Δ5-desaturase transgene 
can be expressed either by itself or with other transgenes, in order to produce 
animal milk containing higher proportions of desired PUFAs or PUFA ratios 
and concentrations that resemble human breast milk (Prieto et al., PCT 
publication WO 95/24494). 

PURIFICATION OF FATTY ACIDS 

The fatty acids desaturated in the Δ5 position may be found in the host 
microorganism or animal as free fatty acids or in conjugated forms such as 
acylglycerols, phospholipids, sulfolipids or glycolipids, and may be extracted 
from the host cell through a variety of means well-known in the art. Such 
means may include extraction with organic solvents, sonication, supercritical 
fluid extraction using for example carbon dioxide, and physical means such as 
presses, or combinations thereof. Of particular interest is extraction with 
methanol and chloroform. Where desirable, the aqueous layer can be acidified 
to protonate negatively charged moieties and thereby increase partitioning of 
desired products into the organic layer. After extraction, the organic solvents 
can be removed by evaporation under a stream of nitrogen. When isolated in 
conjugated forms, the products may be enzymatically or chemically cleaved to 
release the free fatty acid or a less complex conjugate of interest, and can then 
be subject to further manipulations to produce a desired end product. Desirably, 
conjugated forms of fatty acids are cleaved with potassium hydroxide. 

If further purification is necessary, standard methods can be employed. 
Such methods may include extraction, treatment with urea, fractional 
crystallization, HPLC, fractional distillation, silica gel chromatography, high 
speed centrifugation or distillation, or combinations of these techniques. 
Protection of reactive groups, such as the acid or alkenyl groups, may be done at 
any step through known techniques, for example alkylation or iodination. 
Methods used include methylation of the fatty acids to produce methyl esters. 
Similarly, protecting groups may be removed at any step. Desirably, 
purification of fractions containing ARA, DHA and EPA may be accomplished 
by treatment with urea and/or fractional distillation. 

USES OF FATTY ACIDS 

There are several uses for fatty acids of the subject invention. Probes 
based on the DNAs of the present invention may find use in methods for 
isolating related molecules or in methods to detect organisms expressing 
desaturases. When used as probes, the DNAs or oligonucleotides must be 
detectable. This is usually accomplished by attaching a label either at an 
internal site, for example via incorporation of a modified residue, or at the 5' or 
3' terminus. Such labels can be directly detectable, can bind to a secondary 
molecule that is detectably labeled, or can bind to an unlabelled secondary 
molecule and a detectably labeled tertiary molecule; this process can be 
extended as long as is practical to achieve a satisfactorily detectable signal 
without unacceptable levels of background signal. Secondary, tertiary, or 
bridging systems can include use of antibodies directed against any other 
molecule, including labels or other antibodies, or can involve any molecules 
which bind to each other, for example a biotin-streptavidin/avidin system. 
Detectable labels typically include radioactive isotopes, molecules which 
chemically or enzymatically produce or alter light, enzymes which produce 
detectable reaction products, magnetic molecules, fluorescent molecules or 
molecules whose fluorescence or light-emitting characteristics change upon 
binding. Examples of labelling methods can be found in USPN 5,011,770. 
Alternatively, the binding of target molecules can be directly detected by 
measuring the change in heat of solution on binding of probe to target via 
isothermal titration calorimetry, or by coating the probe or target on a surface 
and detecting the change in scattering of light from the surface produced by 
binding of target or probe, respectively, as may be done with the BIAcore 
system. 

PUFAs produced by recombinant means find applications in a wide 
variety of areas. Supplementation of humans or animals with PUFAs in various 
forms can result in increased levels not only of the added PUFAs, but of their 
metabolic progeny as well. For example, where the inherent Δ5-desaturase 
pathway is dysfunctional in an individual, treatment with ARA can result not 
only in increased levels of ARA, but also of downstream products of ARA such 
as prostaglandins (see Figure 1). Complex regulatory mechanisms can make it 
desirable to combine various PUFAs, or to add different conjugates of PUFAs, 
in order to prevent, control or overcome such mechanisms to achieve the 
desired levels of specific PUFAs in an individual. 

NUTRITIONAL COMPOSITIONS 

The present invention also includes nutritional compositions. Such 
compositions, for purposes of the present invention, include any food or 
preparation for human consumption including for enteral or parenteral 
consumption, which when taken into the body (a) serve to nourish or build up 
tissues or supply energy and/or (b) maintain, restore or support adequate 
nutritional status or metabolic function. 

The nutritional composition of the present invention comprises at least 
one oil or acid produced in accordance with the present invention and may 
either be in a solid or liquid form. Additionally, the composition may include 
edible macronutrients, vitamins and minerals in amounts desired for a particular 
use. The amount of such ingredients will vary depending on whether the 
composition is intended for use with normal, healthy infants, children or adults 
having specialized needs such as those which accompany certain metabolic 
conditions (e.g., metabolic disorders). 

Examples of macronutrients which may be added to the composition 
include but are not limited to edible fats, carbohydrates and proteins. Examples 
of such edible fats include but are not limited to coconut oil, soy oil, and mono- 
and diglycerides. Examples of such carbohydrates include but are not limited to 
glucose, edible lactose and hydrolyzed starch. Additionally, examples of 
proteins which may be utilized in the nutritional composition of the invention 
include but are not limited to soy proteins, electrodialysed whey, 
electrodialysed skim milk, milk whey, or the hydrolysates of these proteins. 

With respect to vitamins and minerals, the following may be added to 
the nutritional compositions of the present invention: calcium, phosphorus, 
potassium, sodium, chloride, magnesium, manganese, iron, copper, zinc, 
selenium, iodine, and Vitamins A, E, D, C, and the B complex. Other such 
vitamins and minerals may also be added. 

The components utilized in the nutritional compositions of the present 
invention will be of semi-purified or purified origin. By semi-purified or purified 
is meant a material which has been prepared by purification of a natural 
material or by synthesis. 

Examples of nutritional compositions of the present invention include 
but are not limited to infant formulas, dietary supplements, and rehydration 
compositions. Nutritional compositions of particular interest include but are not 
limited to those utilized for enteral and parenteral supplementation for infants, 
specialist infant formulae, supplements for the elderly, and supplements for 
those with gastrointestinal difficulties and/or malabsorption. 

Nutritional Compositions 

A typical nutritional composition of the present invention will contain 
edible macronutrients, vitamins and minerals in amounts desired for a particular 
use. The amounts of such ingredients will vary depending on whether the 
formulation is intended for use with normal, healthy individuals temporarily 
exposed to stress, or to subjects having specialized needs due to certain chronic 
or acute disease states (e.g., metabolic disorders). It will be understood by 
persons skilled in the art that the components utilized in a nutritional 
formulation of the present invention are of semi-purified or purified origin. By 
semi-purified or purified is meant a material that has been prepared by 
purification of a natural material or by synthesis. These techniques are well 
known in the art (See, e.g., Code of Federal Regulations for Food Ingredients 
and Food Processing; Recommended Dietary Allowances, 10th Ed., National 
Academy Press, Washington, D.C., 1989). 

In a preferred embodiment, a nutritional formulation of the present 
invention is an enteral nutritional product, more preferably an adult or child 
enteral nutritional product. Accordingly in a further aspect of the invention, a 
nutritional formulation is provided that is suitable for feeding adults who are 
experiencing stress. The formula comprises, in addition to the PUFAs of the 
invention, macronutrients, vitamins and minerals in amounts designed to 
provide the daily nutritional requirements of adults. 

The macronutritional components include edible fats, carbohydrates and 
proteins. Exemplary edible fats are coconut oil, soy oil, and mono- and 
diglycerides and the PUFA oils of this invention. Exemplary carbohydrates are 
glucose, edible lactose and hydrolyzed cornstarch. A typical protein source 
would be soy protein, electrodialysed whey or electrodialysed skim milk or milk 
whey, or the hydrolysates of these proteins, although other protein sources are 
also available and may be used. These macronutrients would be added in the 
form of commonly accepted nutritional compounds in amounts equivalent to 
those present in human milk on an energy basis, i.e., on a per calorie basis. 

Methods for formulating liquid and enteral nutritional formulas are well 
known in the art and are described in detail in the examples. 

The enteral formula can be sterilized and subsequently utilized on a 
ready-to-feed (RTF) basis or stored in a concentrated liquid or a powder. The 
powder can be prepared by spray drying the enteral formula prepared as 
indicated above, and the formula can be reconstituted by rehydrating the 
concentrate. Adult and infant nutritional formulas are well known in the art and 
commercially available (e.g., Similac®, Ensure®, Jevity® and Alimentum® 
from Ross Products Division, Abbott Laboratories). An oil or acid of the 
present invention can be added to any of these formulas in the amounts 
described below. 

The energy density of the nutritional composition, when in liquid form, 
can typically range from about 0.6 Kcal to 3.0 Kcal per ml. When in solid or 
powdered form, the nutritional supplement can contain from about 1.2 to more 
than 9 Kcal per gram, preferably 3 to 7 Kcal per gram. In general, the 
osmolality of a liquid product should be less than 700 mOsm and more 
preferably less than 660 mOsm. 
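
The energy-density and osmolality figures in the preceding paragraph can be read as simple range checks. The following is a minimal sketch under that reading; the function names and the dictionary keys are hypothetical and are not part of the disclosure, and the text's "about" qualifier is not modelled.

```python
def check_liquid(kcal_per_ml: float, osmolality_mosm: float) -> dict:
    """Compare a liquid product with the ranges quoted in the text."""
    return {
        # typical liquid energy density: about 0.6 to 3.0 Kcal per ml
        "energy_typical": 0.6 <= kcal_per_ml <= 3.0,
        # osmolality should be below 700 mOsm, preferably below 660 mOsm
        "osmolality_ok": osmolality_mosm < 700,
        "osmolality_preferred": osmolality_mosm < 660,
    }


def check_solid(kcal_per_gram: float) -> dict:
    """Compare a solid or powdered product with the quoted ranges."""
    return {
        # "from about 1.2 to more than 9" is treated as a lower bound only
        "energy_typical": kcal_per_gram >= 1.2,
        # preferred range: 3 to 7 Kcal per gram
        "energy_preferred": 3.0 <= kcal_per_gram <= 7.0,
    }
```

For instance, a liquid at 1.0 Kcal per ml and 680 mOsm would pass the general osmolality limit but not the preferred one.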

The nutritional formula would typically include vitamins and minerals, 
in addition to the PUFAs of the invention, in order to help the individual ingest 
the minimum daily requirements for these substances. In addition to the PUFAs 
listed above, it may also be desirable to supplement the nutritional composition 
with zinc, copper, and folic acid in addition to antioxidants. It is believed that 
these substances will also provide a boost to the stressed immune system and 
thus will provide further benefits to the individual. The presence of zinc, 
copper or folic acid is optional and is not required in order to gain the beneficial 
effects on immune suppression. Likewise a pharmaceutical composition can be 
supplemented with these same substances as well. 

In a more preferred embodiment, the nutritional composition contains, in 
addition to the antioxidant system and the PUFA component, a source of 
carbohydrate wherein at least 5 weight % of said carbohydrate is an indigestible 
oligosaccharide. In yet a more preferred embodiment, the nutritional 
composition additionally contains protein, taurine and carnitine. 

The PUFAs, or derivatives thereof, made by the disclosed method can 
be used as dietary substitutes, or supplements, particularly infant formulas, for 
patients undergoing intravenous feeding or for preventing or treating 
malnutrition. Typically, human breast milk has a fatty acid profile comprising 
from about 0.15 % to about 0.36 % as DHA, from about 0.03 % to about 0.13 % 
as EPA, from about 0.30 % to about 0.88 % as ARA, from about 0.22 % to 
about 0.67 % as DGLA, and from about 0.27 % to about 1.04 % as GLA. 
Additionally, the predominant triglyceride in human milk has been reported to 
be 1,3-di-oleoyl-2-palmitoyl, with 2-palmitoyl glycerides reported as better 
absorbed than 2-oleoyl or 2-linoleoyl glycerides (USPN 4,876,107). Thus, fatty 
acids such as ARA, DGLA, GLA and/or EPA produced by the invention can be 
used to alter the composition of infant formulas to better replicate the PUFA 
composition of human breast milk. In particular, an oil composition for use in a 
pharmacologic or food supplement, particularly a breast milk substitute or 
supplement, will preferably comprise one or more of ARA, DGLA and GLA. 
More preferably the oil will comprise from about 0.3 to 30% ARA, from about 
0.2 to 30% DGLA, and from about 0.2 to about 30% GLA. 
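
As an illustration of how the human breast milk ranges quoted above might be used when comparing a candidate formula, the sketch below flags fatty acids whose percentage falls outside the quoted range. The table name, function name and sample values are hypothetical; only the numeric ranges come from the text, and the "about" qualifier is not modelled.

```python
# Percent of total fatty acids in human breast milk, as quoted in the text.
HUMAN_MILK_RANGES = {
    "DHA": (0.15, 0.36),
    "EPA": (0.03, 0.13),
    "ARA": (0.30, 0.88),
    "DGLA": (0.22, 0.67),
    "GLA": (0.27, 1.04),
}


def out_of_range(profile: dict) -> dict:
    """Return the fatty acids in `profile` that are missing or outside
    the quoted human-milk range, mapped to the offending value."""
    flagged = {}
    for name, (low, high) in HUMAN_MILK_RANGES.items():
        value = profile.get(name)
        if value is None or not (low <= value <= high):
            flagged[name] = value
    return flagged
```

A profile with every value inside its range yields an empty dictionary; a profile with, say, 0.10 % ARA is reported as `{"ARA": 0.10}`.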

In addition to the concentration, the ratios of ARA, DGLA and GLA can 
be adapted for a particular given end use. When formulated as a breast milk 
supplement or substitute, an oil composition which contains two or more of 
ARA, DGLA and GLA will be provided in a ratio of about 1:19:30 to about 
6:1:0.2, respectively. For example, the breast milk of animals can vary in ratios 
of ARA:DGLA:GLA ranging from 1:19:30 to 6:1:0.2, which includes 
intermediate ratios which are preferably about 1:1:1, 1:2:1, 1:1:4. When 
produced together in a host cell, adjusting the rate and percent of conversion of 
a precursor substrate such as GLA and DGLA to ARA can be used to precisely 
control the PUFA ratios. For example, a 5% to 10% conversion rate of DGLA 
to ARA can be used to produce an ARA to DGLA ratio of about 1:19, whereas 
a conversion rate of about 75% to 80% can be used to produce an ARA to 
DGLA ratio of about 6:1. Therefore, whether in a cell culture system or in a 
host animal, regulating the timing, extent and specificity of desaturase 
expression as described can be used to modulate the PUFA levels and ratios. 
Depending on the expression system used, e.g., cell culture or an animal 
expressing oil(s) in its milk, the oils also can be isolated and recombined in the 
desired concentrations and ratios. Amounts of oils providing these ratios of 
PUFA can be determined following standard protocols. PUFAs, or host cells 
containing them, also can be used as animal food supplements to alter an 
animal's tissue or milk fatty acid composition to one more desirable for human 
or animal consumption. 
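
The relationship between percent conversion and the resulting ARA to DGLA ratio can be sketched with a simple mass balance. The sketch assumes, purely for illustration, that each molecule of DGLA converted yields one molecule of ARA and that neither pool is consumed or replenished by any other pathway; real host cells need not behave this way, so the figures given in the text should be read as approximate. The function names are hypothetical.

```python
from fractions import Fraction


def ara_to_dgla_ratio(conversion: Fraction) -> Fraction:
    """ARA:DGLA ratio after a fraction `conversion` of DGLA becomes ARA.

    Assumes a closed two-pool mass balance (an illustrative assumption).
    """
    if not 0 <= conversion < 1:
        raise ValueError("conversion must be in [0, 1)")
    return conversion / (1 - conversion)


def conversion_for_ratio(ara: int, dgla: int) -> Fraction:
    """Fraction of DGLA that must be converted to reach ara:dgla."""
    return Fraction(ara, ara + dgla)


# 5% conversion gives exactly 1:19 under this model.
print(ara_to_dgla_ratio(Fraction(5, 100)))
# Conversion needed for a 6:1 ratio under the same model.
print(conversion_for_ratio(6, 1))
```

Exact fractions are used so that ratios such as 1:19 are reproduced without floating-point rounding.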

For dietary supplementation, the purified PUFAs, or derivatives thereof, 
may be incorporated into cooking oils, fats or margarines formulated so that in 
normal use the recipient would receive the desired amount. The PUFAs may 
also be incorporated into infant formulas, nutritional supplements or other food 
products, and may find use as anti-inflammatory or cholesterol lowering agents. 

Pharmaceutical Compositions 

The present invention also encompasses a pharmaceutical composition 
comprising one or more of the acids and/or resulting oils produced in 
accordance with the methods described herein. More specifically, such a 
pharmaceutical composition may comprise one or more of the acids and/or oils 
as well as a standard, well-known, non-toxic pharmaceutically acceptable 
carrier, adjuvant or vehicle such as, for example, phosphate buffered saline, 
water, ethanol, polyols, vegetable oils, a wetting agent or an emulsion such as a 
water/oil emulsion. The composition may be in either a liquid or solid form. 
For example, the composition may be in the form of a tablet, capsule, ingestible 
liquid or powder, injectable, or topical ointment or cream. 

Possible routes of administration include, for example, oral, rectal and 
parenteral. The route of administration will, of course, depend upon the desired 
effect. For example, if the composition is being utilized to treat rough, dry, or 
aging skin, to treat injured or burned skin, or to treat skin or hair affected by a 
disease or condition, it may perhaps be applied topically. 

The dosage of the composition to be administered to the patient may be 
determined by one of ordinary skill in the art and depends upon various factors 
such as weight of the patient, age of the patient, immune status of the patient, 
etc. 

With respect to form, the composition may be, for example, a solution, a 
dispersion, a suspension, an emulsion or a sterile powder which is then 
reconstituted. 

Additionally, the composition of the present invention may be utilized 
for cosmetic purposes. It may be added to pre-existing cosmetic compositions 
such that a mixture is formed or may be used as a sole composition. 

Pharmaceutical compositions may be utilized to administer the PUFA 
component to an individual. Suitable pharmaceutical compositions may 
comprise physiologically acceptable sterile aqueous or non-aqueous solutions, 
dispersions, suspensions or emulsions and sterile powders for reconstitution into 
sterile solutions or dispersions for ingestion. Examples of suitable aqueous and 
non-aqueous carriers, diluents, solvents or vehicles include water, ethanol, 
polyols (propylene glycol, polyethylene glycol, glycerol, and the like), suitable 
mixtures thereof, vegetable oils (such as olive oil) and injectable organic esters 
such as ethyl oleate. Proper fluidity can be maintained, for example, by the 
maintenance of the required particle size in the case of dispersions and by the 
use of surfactants. It may also be desirable to include isotonic agents, for 
example sugars, sodium chloride and the like. Besides such inert diluents, the 
composition can also include adjuvants, such as wetting agents, emulsifying and 
suspending agents, sweetening, flavoring and perfuming agents. 

Suspensions, in addition to the active compounds, may contain 
suspending agents, as for example, ethoxylated isostearyl alcohols, 
polyoxyethylene sorbitol and sorbitan esters, microcrystalline cellulose, 
aluminum metahydroxide, bentonite, agar-agar and tragacanth or mixtures of 
these substances, and the like. 

Solid dosage forms such as tablets and capsules can be prepared using 
techniques well known in the art. For example, PUFAs of the invention can be 
tableted with conventional tablet bases such as lactose, sucrose, and cornstarch 
in combination with binders such as acacia, cornstarch or gelatin, disintegrating 
agents such as potato starch or alginic acid and a lubricant such as stearic acid 
or magnesium stearate. Capsules can be prepared by incorporating these 
excipients into a gelatin capsule along with the antioxidants and the PUFA 
component. The amount of the antioxidants and PUFA component that should 
be incorporated into the pharmaceutical formulation should fit within the 
guidelines discussed above. 

As used in this application, the term "treat" refers to either preventing, or 
reducing the incidence of, the undesired occurrence. For example, to treat 
immune suppression refers to either preventing the occurrence of this 
suppression or reducing the amount of such suppression. The terms "patient" 
and "individual" are being used interchangeably and both refer to an animal. 
The term "animal" as used in this application refers to any warm-blooded 
mammal including, but not limited to, dogs, humans, monkeys, and apes. As 
used in the application the term "about" refers to an amount varying from the 
stated range or number by a reasonable amount depending upon the context of 
use. Any numerical number or range specified in the specification should be 
considered to be modified by the term "about". 

"Dose" and "serving" are used interchangeably and refer to the amount 
of the nutritional or pharmaceutical composition ingested by the patient in a 
single sitting and designed to deliver effective amounts of the antioxidants and 
the structured triglyceride. As will be readily apparent to those skilled in the 
art, a single dose or serving of the liquid nutritional powder should supply the 
amount of antioxidants and PUFAs discussed above. The amount of the dose or 
serving should be a volume that a typical adult can consume in one sitting. This 
amount can vary widely depending upon the age, weight, sex or medical 
condition of the patient. However as a general guideline, a single serving or 
dose of a liquid nutritional product should be considered as encompassing a 
volume from 100 to 600 ml, more preferably from 125 to 500 ml and most 
preferably from 125 to 300 ml. 

The PUFAs of the present invention may also be added to food even 
when supplementation of the diet is not required. For example, the composition 
may be added to food of any type including but not limited to margarines, 
modified butters, cheeses, milk, yogurt, chocolate, candy, snacks, salad oils, 
cooking oils, cooking fats, meats, fish and beverages. 

Pharmaceutical Applications 

For pharmaceutical use (human or veterinary), the compositions are 
generally administered orally but can be administered by any route by which 
they may be successfully absorbed, e.g., parenterally (i.e. subcutaneously, 
intramuscularly or intravenously), rectally or vaginally or topically, for 
example, as a skin ointment or lotion. The PUFAs of the present invention may 
be administered alone or in combination with a pharmaceutically acceptable 
carrier or excipient. Where available, gelatin capsules are the preferred form of 
oral administration. Dietary supplementation as set forth above also can 
provide an oral route of administration. The unsaturated acids of the present 
invention may be administered in conjugated forms, or as salts, esters, amides 
or prodrugs of the fatty acids. Any pharmaceutically acceptable salt is 
encompassed by the present invention; especially preferred are the sodium, 
potassium or lithium salts. Also encompassed are the N-alkylpolyhydroxamine 
salts, such as N-methyl glucamine, found in PCT publication WO 96/33155. 
The preferred esters are the ethyl esters. As solid salts, the PUFAs also can be 
administered in tablet form. For intravenous administration, the PUFAs or 
derivatives thereof may be incorporated into commercial formulations such as 
Intralipids. The typical normal adult plasma fatty acid profile comprises 6.64 to 
9.46% of ARA, 1.45 to 3.11% of DGLA, and 0.02 to 0.08% of GLA. These 
PUFAs or their metabolic precursors can be administered, either alone or in 
mixtures with other PUFAs, to achieve a normal fatty acid profile in a patient. 
Where desired, the individual components of formulations may be individually 
provided in kit form, for single or multiple use. A typical dosage of a particular 
fatty acid is from 0.1 mg to 20 g, or even 100 g daily, and is preferably from 10 
mg to 1, 2, 5 or 10 g daily as required, or molar equivalent amounts of 
derivative forms thereof. Parenteral nutrition compositions comprising from 
about 2 to about 30 weight percent fatty acids calculated as triglycerides are 
encompassed by the present invention; preferred is a composition having from 
about 1 to about 25 weight percent of the total PUFA composition as GLA 
(USPN 5,196,198). Other vitamins, and particularly fat-soluble vitamins such 
as vitamin A, D, E and L-carnitine can optionally be included. Where desired, a 
preservative such as a tocopherol may be added, typically at about 0.1% by 
weight. 

Suitable pharmaceutical compositions may comprise physiologically
acceptable sterile aqueous or non-aqueous solutions, dispersions, suspensions or
emulsions and sterile powders for reconstitution into sterile injectable solutions
or dispersions. Examples of suitable aqueous and non-aqueous carriers,
diluents, solvents or vehicles include water, ethanol, polyols (propylene glycol,
polyethylene glycol, glycerol, and the like), suitable mixtures thereof, vegetable
oils (such as olive oil) and injectable organic esters such as ethyl oleate. Proper
fluidity can be maintained, for example, by the maintenance of the required
particle size in the case of dispersions and by the use of surfactants. It may also
be desirable to include isotonic agents, for example sugars, sodium chloride and
the like. Besides such inert diluents, the composition can also include
adjuvants, such as wetting agents, emulsifying and suspending agents,
sweetening, flavoring and perfuming agents.

Suspensions, in addition to the active compounds, may contain
suspending agents, as for example, ethoxylated isostearyl alcohols,
polyoxyethylene sorbitol and sorbitan esters, microcrystalline cellulose,
aluminum metahydroxide, bentonite, agar-agar and tragacanth, or mixtures of
these substances and the like.

An especially preferred pharmaceutical composition contains
diacetyltartaric acid esters of mono- and diglycerides dissolved in an aqueous
medium or solvent. Diacetyltartaric acid esters of mono- and diglycerides have
an HLB value of about 9-12 and are significantly more hydrophilic than existing
antimicrobial lipids that have HLB values of 2-4. Those existing hydrophobic
lipids cannot be formulated into aqueous compositions. As disclosed herein,
those lipids can now be solubilized into aqueous media in combination with
diacetyltartaric acid esters of mono- and diglycerides. In accordance with this
embodiment, diacetyltartaric acid esters of mono- and diglycerides (e.g.,
DATEM-C12:0) are melted with other active antimicrobial lipids (e.g., 18:2 and
12:0 monoglycerides) and mixed to obtain a homogeneous mixture.
Homogeneity allows for increased antimicrobial activity. The mixture can be
completely dispersed in water. This is not possible without the addition of
diacetyltartaric acid esters of mono- and diglycerides and premixing with other
monoglycerides prior to introduction into water. The aqueous composition can
then be admixed under sterile conditions with physiologically acceptable
diluents, preservatives, buffers or propellants as may be required to form a spray
or inhalant.

The present invention also encompasses the treatment of numerous
disorders with fatty acids. Supplementation with PUFAs of the present
invention can be used to treat restenosis after angioplasty. Symptoms of
inflammation, rheumatoid arthritis, asthma and psoriasis can be treated with
the PUFAs of the present invention. Evidence indicates that PUFAs may be
involved in calcium metabolism, suggesting that PUFAs of the present
invention may be used in the treatment or prevention of osteoporosis and of
kidney or urinary tract stones.

The PUFAs of the present invention can be used in the treatment of
cancer. Malignant cells have been shown to have altered fatty acid
compositions; addition of fatty acids has been shown to slow their growth and
cause cell death, and to increase their susceptibility to chemotherapeutic agents.
GLA has been shown to cause reexpression on cancer cells of the E-cadherin
cellular adhesion molecules, loss of which is associated with aggressive
metastasis. Clinical testing of intravenous administration of the water soluble
lithium salt of GLA to pancreatic cancer patients produced statistically
significant increases in their survival. PUFA supplementation may also be
useful for treating cachexia associated with cancer.

The PUFAs of the present invention can also be used to treat diabetes
(USPN 4,826,877; Horrobin et al. Am. J. Clin. Nutr. Vol. 57 (Suppl.), 732S-
737S). Altered fatty acid metabolism and composition have been demonstrated
in diabetic animals. These alterations have been suggested to be involved in
some of the long-term complications resulting from diabetes, including
retinopathy, neuropathy, nephropathy and reproductive system damage.
Primrose oil, which contains GLA, has been shown to prevent and reverse
diabetic nerve damage.

The PUFAs of the present invention can be used to treat eczema, reduce
blood pressure and improve math scores. Essential fatty acid deficiency has
been suggested as being involved in eczema, and studies have shown beneficial
effects on eczema from treatment with GLA. GLA has also been shown to
reduce increases in blood pressure associated with stress, and to improve
performance on arithmetic tests. GLA and DGLA have been shown to inhibit
platelet aggregation, cause vasodilation, lower cholesterol levels and inhibit
proliferation of vessel wall smooth muscle and fibrous tissue (Brenner et al.
Adv. Exp. Med. Biol. Vol. 83, p. 85-101, 1976). Administration of GLA or
DGLA, alone or in combination with EPA, has been shown to reduce or prevent
gastro-intestinal bleeding and other side effects caused by non-steroidal
anti-inflammatory drugs (USPN 4,666,701). GLA and DGLA have also been shown
to prevent or treat endometriosis and premenstrual syndrome (USPN 4,758,592)
and to treat myalgic encephalomyelitis and chronic fatigue after viral infections
(USPN 5,116,871).

Further uses of the PUFAs of this invention include use in treatment of
AIDS, multiple sclerosis, acute respiratory syndrome, hypertension and
inflammatory skin disorders. The PUFAs of the invention also can be used for
formulas for general health as well as for geriatric treatments.

Veterinary Applications

It should be noted that the above-described pharmaceutical and
nutritional compositions may be utilized in connection with animals, as well as
humans, as animals experience many of the same needs and conditions as
humans. For example, the oil or acids of the present invention may be utilized in
animal feed supplements.

The following examples are presented by way of illustration, not of 
limitation. 



Examples

Example 1    Isolation of a Δ5-desaturase Nucleotide Sequence from
             Mortierella alpina

Example 2    Expression of M. alpina Δ5-desaturase Clones in Baker's Yeast

Example 3    Initial Optimization of Culture Conditions

Example 4    Distribution of PUFAs in Yeast Lipid Fractions

Example 5    Further Culture Optimization

Example 6    Identification of Homologues to M. alpina Δ5 and Δ6
             desaturases

Example 7    Identification of M. alpina Δ5 and Δ6 homologues in
             other PUFA-producing organisms

Example 8    Identification of M. alpina Δ5 and Δ6 homologues in
             other PUFA-producing organisms

Example 9    Human Desaturase Sequences

Example 10   Nutritional Compositions

Example 1

Isolation of a Δ5-desaturase Nucleotide Sequence from Mortierella alpina

Mortierella alpina produces arachidonic acid (ARA, 20:4) from the
precursor 20:3 by a Δ5-desaturase. A nucleotide sequence encoding the
Δ5-desaturase from Mortierella alpina was obtained through PCR amplification
using M. alpina 1st strand cDNA and degenerate oligonucleotide primers
corresponding to amino acid sequences conserved between Δ6-desaturases from
Synechocystis and Spirulina. The procedure used was as follows:

Total RNA was isolated from a 3 day old PUFA-producing culture of
Mortierella alpina using the protocol of Hoge et al. (1982) Experimental
Mycology 6:225-232. The RNA was used to prepare double-stranded cDNA
using BRL's lambda-ZipLox system, following the manufacturer's instructions.
Several size fractions of the M. alpina cDNA were packaged separately to yield
libraries with different average-sized inserts. The "full-length" library contains
approximately 3x10^ clones with an average insert size of 1.77 kb. The
"sequencing-grade" library contains approximately 6x10^ clones with an
average insert size of 1.1 kb.

5 µg of total RNA was reverse transcribed using BRL Superscript RTase
and the primer TSyn (5'-CCAAGCTTCTGCAGGAGCTCTTTTTTTTTTTTTTT-3'),
SEQ ID NO:10. Degenerate oligonucleotides were designed
to regions conserved between the two cyanobacterial Δ6-desaturase sequences.
The specific primers used were D6DESAT-F3 (SEQ ID NO:8) (5'-
CUACUACUACUACAYCAYACOTAYACOAAYAT-3') and D6DESAT-R3
(SEQ ID NO:9) (5'-CAUCAUCAUCAUOGGRAAOARRTGRTG-3'), where
Y=C+T, R=A+G, and O=I+C. PCR amplification was carried out in a 25 µl
volume containing: template derived from 40 ng total RNA, 2 pM each primer,
200 µM each deoxyribonucleotide triphosphate, 60 mM Tris-Cl, pH 8.5, 15 mM
(NH4)2SO4, 2 mM MgCl2. Samples were subjected to an initial denaturation
step of 95 degrees (all temperatures Celsius) for 5 minutes, then held at 72
degrees while 0.2 U of Taq polymerase were added. PCR thermocycling
conditions were as follows: 94 degrees for 1 min., 45 degrees for 1.5 min., 72
degrees for 2 min. PCR was continued for 35 cycles. PCR using these primers
on the M. alpina first-strand cDNA produced a 550 bp reaction product.
Comparison of the deduced amino acid sequence of the M. alpina PCR
fragment SEQ ID NO:3 revealed regions of homology with Δ6-desaturases (see
Figure 5). However, there was only about 28% identity over the region
compared.
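
The degeneracy of the two primers can be illustrated with a short sketch. It assumes the primer sequences and the code definitions (Y=C+T, R=A+G, O=I+C) as read from the paragraph above; the function names are hypothetical and "I" is simply treated as a fifth symbol standing for inosine.

```python
from itertools import product

# Degenerate codes as defined in the text: Y = C+T, R = A+G, O = I+C.
# "I" denotes inosine; "U" is kept exactly as written in the primers.
DEGENERATE = {"Y": "CT", "R": "AG", "O": "IC"}


def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides represented by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(DEGENERATE.get(base, base))
    return count


def expand(primer: str):
    """Yield every concrete sequence encoded by a degenerate primer."""
    choices = [DEGENERATE.get(base, base) for base in primer]
    return ("".join(combo) for combo in product(*choices))


# Primer sequences as given in Example 1.
D6DESAT_F3 = "CUACUACUACUACAYCAYACOTAYACOAAYAT"
D6DESAT_R3 = "CAUCAUCAUCAUOGGRAAOARRTGRTG"

if __name__ == "__main__":
    for name, seq in (("D6DESAT-F3", D6DESAT_F3), ("D6DESAT-R3", D6DESAT_R3)):
        print(name, "represents", degeneracy(seq), "sequences")
```

Each degenerate position doubles the number of oligonucleotides in the pool, so a primer with six such positions represents 2^6 distinct sequences.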

The PCR product was used as a probe to isolate corresponding cDNA
clones from a M. alpina library. The longest cDNA clone, Ma29, was
designated pCGN5521 and has been completely sequenced on both strands.
The cDNA is contained as a 1481 bp insert in the vector pZL1 (Bethesda
Research Laboratories) and, beginning with the first ATG, contains an open
reading frame encoding 446 amino acids. The reading frame contains the
sequence deduced from the PCR fragment. The sequence of the cDNA insert
was found to contain regions of homology to Δ6-desaturases (see Figure 5). For
example, three conserved "histidine boxes" (that have been observed in
membrane-bound desaturases (Okuley et al. (1994) The Plant Cell 6:147-158))
were found to be present in the Mortierella sequence at amino acid positions




wo 98/46765 



PCT/US98/07422 



171-175, 207-212, and 387-391 (see Figure 3). However, the typical
"HXXHH" amino acid motif for the third histidine box for the Mortierella
desaturase was found to be QXXHH, SEQ ID NO:11-12. Surprisingly, the
amino-terminus of the encoded protein showed significant homology to
cytochrome b5 proteins. Thus, the Mortierella cDNA clone appears to
represent a fusion between a cytochrome b5 and a fatty acid desaturase. Since
cytochrome b5 is believed to function as the electron donor for membrane-
bound desaturase enzymes, it is possible that the N-terminal cytochrome b5
domain of this desaturase protein is involved in its function. This may be
advantageous when expressing the desaturase in heterologous systems for
PUFA production.
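
A scan for the HXXHH motif and its QXXHH variant can be sketched as follows. This is only an illustration: the toy sequence is hypothetical and is not the Ma29 protein, and the pattern covers only the five-residue motif named in the text, not every form of histidine box.

```python
import re

# HXXHH is the typical motif; QXXHH is the variant reported for the third
# histidine box of the Mortierella desaturase. The lookahead lets the scan
# report overlapping candidates as well.
HIS_BOX = re.compile(r"(?=([HQ]..HH))")


def find_his_boxes(protein: str):
    """Return (1-based start position, motif) for each HXXHH/QXXHH match."""
    return [(m.start() + 1, m.group(1)) for m in HIS_BOX.finditer(protein.upper())]


if __name__ == "__main__":
    toy = "MAHDAHHKLLQIWHHGG"  # hypothetical peptide, for illustration only
    for position, motif in find_his_boxes(toy):
        print(position, motif)
```

Positions are reported 1-based so that they can be compared directly with amino acid coordinates such as those quoted above.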

Example 2

Expression of M. alpina Desaturase Clones in Baker's Yeast

Yeast Transformation

Lithium acetate transformation of yeast was performed according to
standard protocols (Methods in Enzymology, Vol. 194, p. 186-187, 1991).
Briefly, yeast were grown in YPD at 30°C. Cells were spun down, resuspended
in TE, spun down again, resuspended in TE containing 100 mM lithium acetate,
spun down again, and resuspended in TE/lithium acetate. The resuspended
yeast were incubated at 30°C for 60 minutes with shaking. Carrier DNA was
added, and the yeast were aliquoted into tubes. Transforming DNA was added,
and the tubes were incubated for 30 min. at 30°C. PEG solution (35% (w/v)
PEG 4000, 100 mM lithium acetate, TE pH 7.5) was added followed by a 50
min. incubation at 30°C. A 5 min. heat shock at 42°C was performed, the cells
were pelleted, washed with TE, pelleted again and resuspended in TE. The
resuspended cells were then plated on selective media.

Desaturase Expression in Transformed Yeast

The cDNA clones from Mortierella alpina were screened for desaturase
activity in baker's yeast. A canola Δ15-desaturase (obtained by PCR using 1st
strand cDNA from Brassica napus cultivar 212/86 seeds using primers based on
the published sequence (Arondel et al. Science 258:1353-1355)) was used as a
positive control. The Δ15-desaturase gene and the gene from cDNA clone
Ma29 were inserted into the expression vector pYES2 (Invitrogen), resulting in
plasmids pCGR-2 and pCGR-4, respectively. These plasmids were transfected
into S. cerevisiae yeast strain 334 and expressed after induction with galactose
and in the presence of substrates that allowed detection of specific desaturase
activity. The control strain was S. cerevisiae strain 334 containing the unaltered
pYES2 vector. The substrates used, the products produced and the indicated
desaturase activity were: DGLA (conversion to ARA would indicate
Δ5-desaturase activity), linoleic acid (conversion to GLA would indicate
Δ6-desaturase activity; conversion to ALA would indicate Δ15-desaturase
activity), oleic acid (an endogenous substrate made by S. cerevisiae; conversion
to linoleic acid would indicate Δ12-desaturase activity, which S. cerevisiae
lacks), or ARA (conversion to EPA would indicate Δ17-desaturase activity).

The results are provided in Table 1 below. The lipid fractions were extracted as
follows: Cultures were grown for 48-52 hours at 15°C. Cells were pelleted by
centrifugation, washed once with sterile ddH2O, and repelleted. Pellets were
vortexed with methanol; chloroform was added along with tritridecanoin (as an
internal standard). The mixtures were incubated for at least one hour at room
temperature or at 4°C overnight. The chloroform layer was extracted and
filtered through a Whatman filter with one gram of anhydrous sodium sulfate to
remove particulates and residual water. The organic solvents were evaporated
at 40°C under a stream of nitrogen. The extracted lipids were then derivatized
to fatty acid methyl esters (FAME) for gas chromatography analysis (GC) by
adding 2 ml of 0.5 N potassium hydroxide in methanol to a closed tube. The
samples were heated to 95°C to 100°C for 30 minutes and cooled to room
temperature. Approximately 2 ml of 14% boron trifluoride in methanol was
added and the heating repeated. After the extracted lipid mixture cooled, 2 ml
of water and 1 ml of hexane were added to extract the FAME for analysis by
GC. The percent conversion was calculated by dividing the product produced
by the sum of (the product produced and the substrate added) and then
multiplying by 100. To calculate the oleic acid percent conversion, as no
substrate was added, the total linoleic acid produced was divided by the sum of
(oleic acid and linoleic acid produced), then multiplied by 100.

Table 1

M. alpina Desaturase Expression in Baker's Yeast

CLONE                     TYPE OF ENZYME   % CONVERSION
                          ACTIVITY         OF SUBSTRATE

pCGR-2                    Δ6               0 (18:2 to 18:3ω6)
(canola Δ15 desaturase)   Δ15              16.3 (18:2 to 18:3ω3)
                          Δ5               2.0 (20:3 to 20:4ω6)
                          Δ17              2.8 (20:4 to 20:5ω3)
                          Δ12              1.8 (18:1 to 18:2ω6)

pCGR-4                    Δ6               0
(M. alpina Ma29)          Δ15              0
                          Δ5               15.3
                          Δ17              0.3
                          Δ12              3.3

The Δ15-desaturase control clone exhibited 16.3% conversion of the
substrate. The pCGR-4 clone expressing the Ma29 cDNA converted 15.3% of
the 20:3 substrate to 20:4ω6, indicating that the gene encodes a Δ5-desaturase.
The background (non-specific conversion of substrate) was between 0-3% in
these cases. We also found substrate inhibition of the activity by using
different concentrations of the substrate. When substrate was added to 100 µM,
the percent conversion to product dropped compared to when substrate was
added to 25 µM (see below). Additionally, by varying the DGLA substrate
concentrations between about 5 µM and about 200 µM, percent conversion of
DGLA to ARA ranged from about 5% to 75% with the M. alpina
Δ5-desaturase.

These data show that desaturases with different substrate specificities
can be expressed in a heterologous system and used to produce poly-unsaturated
long chain fatty acids.
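
The percent conversion calculation described for Table 1 can be written out as a short sketch. The function name and the example amounts are hypothetical; only the formula (product divided by the sum of product and substrate, times 100) is taken from the text.

```python
def percent_conversion(product: float, substrate: float) -> float:
    """Percent of substrate converted to product.

    Both amounts must be in the same units (for example, percent of total
    lipid or micrograms). For an endogenous substrate such as 18:1, the
    amount of substrate remaining in the extract is used in place of the
    amount added.
    """
    total = product + substrate
    if product < 0 or substrate < 0 or total == 0:
        raise ValueError("amounts must be non-negative and not both zero")
    return 100.0 * product / total


if __name__ == "__main__":
    # Hypothetical amounts: 1 unit of product recovered with 3 units of
    # unconverted substrate corresponds to 25% conversion.
    print(percent_conversion(product=1.0, substrate=3.0))
```

Because the denominator includes the product itself, the value is bounded between 0 and 100 regardless of how much substrate the cells took up.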

Table 2 represents fatty acids of interest as a percent of the total lipid
extracted from the yeast host S. cerevisiae 334 with the indicated plasmid. No
glucose was present in the growth media. Affinity gas chromatography was
used to separate the respective lipids. GC/MS was employed to verify the
identity of the product(s). The expected product for the B. napus Δ15-desaturase,
α-linolenic acid, was detected when its substrate, linoleic acid, was added
exogenously to the induced yeast culture. This finding demonstrates that yeast
expression of a desaturase gene can produce functional enzyme and detectable
amounts of product under the current growth conditions. Both exogenously
added substrates were taken up by yeast, although slightly less of the longer
chain PUFA, dihomo-γ-linolenic acid (20:3), was incorporated into yeast than
linoleic acid (18:2) when either was added in free form to the induced yeast
cultures. Arachidonic acid was detected as a novel PUFA in yeast when
dihomo-γ-linolenic acid was added as the substrate to S. cerevisiae 334
(pCGR-4). This identifies pCGR-4 (MA29) as the Δ5-desaturase from M. alpina.
Prior to this, no isolation and expression of a Δ5-desaturase from any source
has been reported.

Example 3

Optimization of Culture Conditions

Table 3A shows the effect of exogenous free fatty acid substrate
concentration on yeast uptake and conversion to fatty acid product as a
percentage of the total yeast lipid extracted. In all instances, low amounts of
exogenous substrate (1-10 µM) resulted in low fatty acid substrate uptake and
product formation. Between 25 and 50 µM concentration of free fatty acid in
the growth and induction media gave the highest percentage of fatty acid
product formed, while the 100 µM concentration and subsequent high uptake
into yeast appeared to decrease or inhibit the desaturase activity. The feedback
inhibition of high fatty acid substrate concentration was well illustrated when
the percent conversion rates of the respective fatty acid substrates to their
respective products were compared in Table 3B. In all cases, 100 µM substrate
concentration in the growth media decreased the percent conversion to product.
The effect of media composition was also evident when glucose was present in
the growth media for the Δ5-desaturase, since the percent of substrate uptake
was decreased at 25 µM (Table 3A). However, the percent conversion by
Δ5-desaturase increased by 18% and the percent product formed remained the
same in the presence of glucose in the growth media.

Table 3A

Effect of Added Substrate on the Percentage of Incorporated
Substrate and Product Formed in Yeast Extracts

Plasmid in Yeast     pCGR-2 (Δ15)    pCGR-4 (Δ5)
substrate/product    18:2/α-18:3     20:3/20:4
1 µM sub.            ND              0.5/1.7
10 µM sub.           ND              3.3/4
25 µM sub.           ND              5.1/6.1
25 µM◊ sub.          36.6/7.2◊       9.3/5.4◊
50 µM sub.           53.1/6.5◊       ND
100 µM sub.          60.1/5.7◊       32.3/5.8◊

Table 3B

Effect of Substrate Concentration in Media on the Percent Conversion
of Fatty Acid Substrate to Product in Yeast Extracts

Plasmid in Yeast     pCGR-2 (Δ15)    pCGR-4 (Δ5)
substrate/product    18:2->α-18:3    20:3->20:4
1 µM sub.            ND              77.3
10 µM sub.           ND              54.8
25 µM sub.           ND              54.2
25 µM◊ sub.          16.4            36.7
50 µM sub.           10.9◊           ND
100 µM sub.          8.7◊            15.2◊

◊ no glucose in media
* Yeast peptone broth (YPD)
* 18:1 is an endogenous yeast lipid
sub. is substrate concentration
ND (not done)
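
As a consistency check, the percent conversions of Table 3B can be recomputed from the substrate/product percentages of Table 3A. The sketch below assumes the table values as read from the two tables; the dictionary keys (including the "no glucose" label) and function name are hypothetical. Small differences are expected because the Table 3A entries are already rounded.

```python
# (substrate %, product %) from Table 3A, keyed by (plasmid, condition).
TABLE_3A = {
    ("pCGR-4", "1 uM"): (0.5, 1.7),
    ("pCGR-4", "10 uM"): (3.3, 4.0),
    ("pCGR-4", "25 uM"): (5.1, 6.1),
    ("pCGR-4", "25 uM, no glucose"): (9.3, 5.4),
    ("pCGR-4", "100 uM"): (32.3, 5.8),
    ("pCGR-2", "25 uM, no glucose"): (36.6, 7.2),
    ("pCGR-2", "50 uM"): (53.1, 6.5),
    ("pCGR-2", "100 uM"): (60.1, 5.7),
}

# Percent conversion reported in Table 3B for the same entries.
TABLE_3B = {
    ("pCGR-4", "1 uM"): 77.3,
    ("pCGR-4", "10 uM"): 54.8,
    ("pCGR-4", "25 uM"): 54.2,
    ("pCGR-4", "25 uM, no glucose"): 36.7,
    ("pCGR-4", "100 uM"): 15.2,
    ("pCGR-2", "25 uM, no glucose"): 16.4,
    ("pCGR-2", "50 uM"): 10.9,
    ("pCGR-2", "100 uM"): 8.7,
}


def recompute_conversion(table_3a):
    """Apply product / (product + substrate) * 100 to every Table 3A entry."""
    return {
        key: 100.0 * product / (substrate + product)
        for key, (substrate, product) in table_3a.items()
    }


if __name__ == "__main__":
    recomputed = recompute_conversion(TABLE_3A)
    for key, reported in TABLE_3B.items():
        print(key, f"reported {reported:.1f}", f"recomputed {recomputed[key]:.1f}")
```

The drop in conversion at 100 µM substrate appears in both the reported and the recomputed values, which is the substrate inhibition effect discussed in Example 3.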

Table 4 shows the amount of fatty acid produced by a recombinant
desaturase from induced yeast cultures when different amounts of free fatty acid
substrate were used. Fatty acid weight was determined since the total amount of
lipid varied dramatically when the growth conditions were changed, such as the
presence of glucose in the yeast growth and induction media. To better
determine the conditions when the recombinant desaturase would produce the
most PUFA product, the quantity of individual fatty acids was examined. The
absence of glucose reduced the amount of arachidonic acid produced by
Δ5-desaturase by half. For Δ5-desaturase the amount of total yeast lipid was
decreased by almost half in the absence of glucose.



Table 4

Fatty Acid Produced in µg from Yeast Extracts

Plasmid in Yeast     pCGR-4          pCGR-7
(enzyme)             (Δ5)            (Δ12)
product              20:4            18:2*
1 µM sub.            8.3             ND
10 µM sub.           19.2            ND
25 µM sub.           31.2            115.7
25 µM◊ sub.          16.5            39◊

◊ no glucose in media
sub. is substrate concentration
ND (not done)
* 18:1, the substrate, is an endogenous yeast lipid

Example 4

Distribution of PUFAs in Yeast Lipid Fractions

Table 5 illustrates the uptake of free fatty acids and their new products
formed in yeast lipids as distributed in the major lipid fractions. A total lipid
extract was prepared as described above. The lipid extract was separated on
TLC plates, and the fractions were identified by comparison to standards. The
bands were collected by scraping, and internal standards were added. The
fractions were then saponified and methylated as above, and subjected to gas
chromatography. The gas chromatograph calculated the amount of fatty acid by
comparison to a standard. It would appear that the substrates are accessible in
the phospholipid form to the desaturases.

Table 5

Fatty Acid Distribution in Various Yeast Lipid Fractions in µg

Fatty acid        Phospholipid   Diglyceride   Free Fatty   Triglyceride   Cholesterol
fraction                                       Acid                        Ester

SC (pCGR-4)       15.1           1.9           22.9         12.6           3.3
substrate 20:3

SC (pCGR-4)       42.6           0.9           6.8          4.9            0.4
product 20:4

SC = S. cerevisiae (plasmid)
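
The distribution in Table 5 can be expressed as the share of each fatty acid found in each lipid fraction. The sketch below uses the µg values from Table 5; the variable and function names are hypothetical.

```python
FRACTIONS = ["Phospholipid", "Diglyceride", "Free Fatty Acid",
             "Triglyceride", "Cholesterol Ester"]

# Micrograms per fraction for S. cerevisiae (pCGR-4), from Table 5.
TABLE_5 = {
    "substrate 20:3": [15.1, 1.9, 22.9, 12.6, 3.3],
    "product 20:4": [42.6, 0.9, 6.8, 4.9, 0.4],
}


def fraction_shares(amounts):
    """Return each lipid fraction's percentage of the fatty acid total."""
    total = sum(amounts)
    return {name: 100.0 * ug / total for name, ug in zip(FRACTIONS, amounts)}


if __name__ == "__main__":
    for fatty_acid, amounts in TABLE_5.items():
        shares = fraction_shares(amounts)
        summary = ", ".join(f"{name} {pct:.1f}%" for name, pct in shares.items())
        print(f"{fatty_acid}: {summary}")
```

On these figures roughly three quarters of the 20:4 product sits in the phospholipid fraction, whereas the 20:3 substrate is spread more evenly, with the free fatty acid fraction the largest.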



Example 5

Further Culture Optimization

The growth and induction conditions for optimal activities of
desaturases in Saccharomyces cerevisiae were evaluated. Various culture
conditions that were manipulated for optimal activity were: i) induction
temperature, ii) concentration of inducer, iii) timing of substrate addition, iv)
concentration of substrate, v) sugar source, vi) growth phase at induction.
These studies were done using the Δ5-desaturase gene from Mortierella alpina
(MA 29). In addition, the effect of changing host strain on expression of the
Δ5-desaturase gene was also determined.

As described above, the best rate of conversion of substrate to ARA was
observed at a substrate concentration of 1 µM; however, the percentage of ARA
in the total fatty acids was highest at 25 µM substrate concentration. To
determine if the substrate needed to be modified to a readily available form
before it could be utilized by the desaturase, the substrate was added either 15
hours before induction or concomitant with inducer addition (indicated as after,
in Figure 6A). As can be seen in Figure 6A, addition of substrate before
induction did not have a significant effect on the activity of Δ5-desaturase. In
fact, addition of substrate along with the inducer was slightly better for
expression/activity of Δ5-desaturase, as ARA levels in the total fatty acids were
higher. However, the rate of conversion of substrate to product was slightly
lower.

The effect of inducer concentration on expression/activity of Mortierella
Δ5-desaturase was examined by inducing SC334/pCGR5 with 0.5 or 2% (w/v)
of galactose. As shown in Figures 7A and 7B, expression of Δ5-desaturase was
higher when induced with 0.5% galactose. Furthermore, the rate of conversion of
substrate to product was also better when SC334/pCGR5 was induced with
0.5% galactose vs 2% galactose.

To determine the effect of temperature on Δ5-desaturase activity, the
SC334 host strain, transformed with pCGR5 (SC334/pCGR5), was grown and
induced at 15°C, 25°C, 30°C and 37°C. The quantity of ARA (20:4n6)
produced in SC334/pCGR5 cultures, supplemented with substrate 20:3n6, was
measured by fatty acid analysis. Figure 8A depicts the quantity of 20:3n6 and
20:4n6, expressed as percentage of total fatty acids. Figure 8B depicts the rate
of conversion of substrate to product. Growth and induction of SC334/pCGR5
at 25°C was the best for the expression of Δ5-desaturase as evidenced by the
highest levels of arachidonic acid in the total fatty acids. Additionally, the
highest rate of conversion of substrate to product also occurred at 25°C.
Growth and induction at 15°C gave the lowest expression of ARA, whereas
37°C gave the lowest conversion of substrate to product.

The effect of yeast strain on expression of the Δ5-desaturase gene was
studied in 5 different host strains: INVSC1, INVSC2, YTC34, YTC41, and
SC334, at 15°C and 30°C. At 15°C, SC334 has the highest percentage of ARA
in total fatty acids, suggesting higher activity of Δ5-desaturase in SC334. The
rate of conversion of substrate to product, however, is lowest in SC334 and
highest in INVSC1 (Fig. 9A and B). At 30°C, the highest percentage of product
(ARA) in total fatty acids was observed in INVSC2, although the rate of
conversion of substrate to product in INVSC2 was slightly lower than INVSC1
(Fig. 10A and B).

ARA, the product of Δ5-desaturase, is stored in the phospholipid fraction
(Example 4). Therefore the quantity of ARA produced in yeast is limited by the
amount that can be stored in the phospholipid fraction. If ARA could also be
stored in other fractions such as the triglyceride fraction, the quantity of ARA
produced in yeast might be increased. To test this hypothesis, the Δ5-desaturase
gene was expressed in the yeast host strain DBY746 (obtained from the Yeast
Genetic Stock Centre, 1021 Donner Laboratory, Berkeley, CA 94720. The
genotype of strain DBY746 is Mata, his3-Δ1, leu2-3, leu2-112, ura3-32,
trp1-289, gal). The DBY746 yeast strain has an endogenous gene for choline
transferase. The presence of this enzyme might enable the DBY746 strain to
convert excess phospholipids into the triglyceride fraction. Results in Fig. 11
show no increase in the conversion of substrate to product as compared to
SC334, which does not have the gene for choline transferase.

To study the effect of media on expression of Δ5-desaturase,
pCGR4/SC334 was grown in four different media at two different temperatures
(15°C and 30°C) and in two different host strains (SC334 and INVSC1). The
composition of the media was as follows:

Media A: mm-Ura, + 2% galactose + 2% glucose.

Media B: mm-Ura, + 20% galactose + 2% glucose + 1 M sorbitol (pH 5.8)

Media C: mm-Ura, + 2% galactose + 2% raffinose

Media D: mm-Ura, + 2% galactose + 2% raffinose + 1 M sorbitol (pH 5.8)

mm = minimal media

Results show that the highest conversion rate of substrate to product at
15°C in SC334 was observed in media A. The highest conversion rate overall
for Δ5-desaturase in SC334 was at 30°C in media D. The highest conversion rate
of Δ5-desaturase in INVSC1 was also at 30°C in media D (Figures 12A and 12B).

These data show that a DNA encoding a desaturase that can convert
DGLA to ARA can be isolated from Mortierella alpina and can be expressed in
a heterologous system and used to produce poly-unsaturated long chain fatty
acids. Exemplified is the production of ARA from the precursor DGLA by
expression of a Δ5-desaturase in yeast.

Example 6

Identification of Homologues to M. alpina Δ5 and Δ6 desaturases

A nucleic acid sequence that encodes a putative Δ5 desaturase was
identified through a TBLASTN search of the EST databases through NCBI using
amino acids 100-446 of Ma29 as a query. The truncated portion of the Ma29
sequence was used to avoid picking up homologies based on the cytochrome b5
portion at the N-terminus of the desaturase. The deduced amino acid sequence
of an EST from Dictyostelium discoideum (accession # C25549) shows very
significant homology to Ma29 and lesser, but still significant, homology to
Ma524. The DNA sequence is presented as SEQ ID NO:13. The amino acid
sequence is presented as SEQ ID NO:14.
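
Truncating a query to a range such as amino acids 100-446 amounts to taking a subsequence with 1-based, inclusive coordinates. A minimal sketch follows; the helper name is hypothetical and the placeholder string merely has the length of the 446 amino acid reading frame, it is not the Ma29 sequence.

```python
def subsequence(protein: str, start: int, end: int) -> str:
    """Return residues start..end using 1-based, inclusive numbering."""
    if not 1 <= start <= end <= len(protein):
        raise ValueError(f"range {start}-{end} is outside 1-{len(protein)}")
    return protein[start - 1:end]


if __name__ == "__main__":
    placeholder = "A" * 446  # stand-in with the same length as the 446 aa ORF
    query = subsequence(placeholder, 100, 446)
    print(len(query), "residues in the truncated query")
```

Dropping the first 99 residues in this way leaves the N-terminal region out of the query, which is the stated purpose of the truncation.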

Example 7

Identification of M. alpina Δ5 and Δ6 homologues in other
PUFA-producing organisms

To look for desaturases involved in PUFA production, a cDNA library
was constructed from total RNA isolated from Phaeodactylum tricornutum. A
plasmid-based cDNA library was constructed in pSPORT1 (GIBCO-BRL)
following manufacturer's instructions using a commercially available kit
(GIBCO-BRL). Random cDNA clones were sequenced and nucleic acid
sequences that encode putative Δ5 or Δ6 desaturases were identified through
BLAST search of the databases and comparison to Ma29 and Ma524 sequences.

One clone was identified from the Phaeodactylum library with
homology to Ma29 and Ma524; it is called 144-011-B12. The DNA sequence is
presented as SEQ ID NO:15. The amino acid sequence is presented as SEQ ID
NO:16.




Example 8

Identification of M. alpina Δ5 and Δ6 homologues in other
PUFA-producing organisms

To look for desaturases involved in PUFA production, a cDNA library
was constructed from total RNA isolated from Schizochytrium species. A
plasmid-based cDNA library was constructed in pSPORT1 (GIBCO-BRL)
following manufacturer's instructions using a commercially available kit
(GIBCO-BRL). Random cDNA clones were sequenced and nucleic acid
sequences that encode putative Δ5 or Δ6 desaturases were identified through
BLAST search of the databases and comparison to Ma29 and Ma524 sequences.

One clone was identified from the Schizochytrium library with
homology to Ma29 and Ma524; it is called 81-23-C7. This clone contains a ~1
kb insert. Partial sequence was obtained from each end of the clone using the
universal forward and reverse sequencing primers. The DNA sequence from
the forward primer is presented as SEQ ID NO:17. The peptide sequence is
presented as SEQ ID NO:18. The DNA sequence from the reverse primer is
presented as SEQ ID NO:19. The amino acid sequence from the reverse primer
is presented as SEQ ID NO:20.

Example 9 

Human Desaturase Gene Sequences

Human desaturase gene sequences potentially involved in long chain
polyunsaturated fatty acid biosynthesis were isolated based on homology
between the human cDNA sequences and Mortierella alpina desaturase gene
sequences. The three "histidine boxes" known to be conserved
among membrane-bound desaturases were found. As with some other
membrane-bound desaturases, the final HXXHH histidine box motif was found
to be QXXHH. The amino acid sequence of the putative human desaturases
exhibited homology to M. alpina Δ5, Δ6, Δ9, and Δ12 desaturases.

The M. alpina Δ5 desaturase and Δ6 desaturase cDNA sequences were
used to search the LifeSeq database of Incyte Pharmaceuticals, Inc., Palo Alto,
California 94304. The Δ5 desaturase sequence was divided into three fragments: 1)
amino acid no. 1-150, 2) amino acid no. 151-300, and 3) amino acid no. 301-446.
The Δ6 desaturase sequence was divided into three fragments: 1) amino
acid no. 1-150, 2) amino acid no. 151-300, and 3) amino acid no. 301-457.
These polypeptide fragments were searched against the database using the
"tblastn" algorithm. This algorithm compares a protein query sequence against
a nucleotide sequence database dynamically translated in all six reading frames
(both strands).
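
The six-frame translation that "tblastn" relies on, and the division of a query protein into residue ranges, can be illustrated with a minimal Python sketch. This is not the tblastn algorithm itself (no scoring or database search is performed); it shows only the translation step. The function names are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate from position 0; codons with non-ACGT symbols become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Translate both strands in all three offsets (frames +1..+3, -1..-3)."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def split_fragments(protein, bounds):
    """Cut a protein into 1-based, inclusive residue ranges such as (151, 300)."""
    return [protein[start - 1:end] for start, end in bounds]
```

With the ranges given above, `split_fragments(seq, [(1, 150), (151, 300), (301, 446)])` would yield the three query fragments for a 446-residue sequence.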

The polypeptide fragments 2 and 3 of M. alpina Δ5 and Δ6 have
homologies with the CloneID sequences as outlined in Table 6. The CloneID
represents an individual sequence from the Incyte LifeSeq database. After the
"tblastn" results had been reviewed, Clone Information was searched with the
default settings of Stringency of >=50, and Productscore <=100 for different
CloneID numbers. The Clone Information Results displayed the information
including the ClusterID, CloneID, Library, HitID, and Hit Description. When
selected, the ClusterID number displayed the clone information of all the clones
that belong in that ClusterID. The Assemble command assembles all of the
CloneIDs which comprise the ClusterID. The following default settings were
used for GCG (Genetics Computer Group, University of Wisconsin
Biotechnology Center, Madison, Wisconsin 53705) Assembly:



Word Size: 7
Minimum Overlap: 14
Stringency: 0.8
Minimum Identity: 14
Maximum Gap: 10
Gap Weight: 8
Length Weight: 2




GCG Assembly Results displayed the contigs generated on the basis of
sequence information within the CloneID. A contig is an alignment of DNA
sequences based on areas of homology among these sequences. A new
sequence (consensus sequence) was generated based on the aligned DNA
sequences within a contig. The contig containing the CloneID was identified,
and the ambiguous sites of the consensus sequence were edited based on the
alignment of the CloneIDs (see SEQ ID NO:21 - SEQ ID NO:25) to generate
the best possible sequence. The procedure was repeated for all six CloneIDs
listed in Table 6. This produced five unique contigs. The edited consensus
sequences of the 5 contigs were imported into the Sequencher software program
(Gene Codes Corporation, Ann Arbor, Michigan 48105). These consensus
sequences were assembled. The contig 2511785 overlaps with contig 3506132,
and this new contig was called 2535 (SEQ ID NO:27). The contigs from the
Sequencher program were copied into the Sequence Analysis software package
of GCG.
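
The idea of deriving a consensus sequence from aligned reads and flagging ambiguous sites for manual editing can be sketched as follows. This is an illustrative majority-vote scheme only, not the GCG or Sequencher procedure; the function name, the gap character and the use of "N" for tied columns are assumptions.

```python
from collections import Counter


def consensus(aligned, gap="-", ambiguous="N"):
    """Majority-vote consensus over equal-length, pre-aligned sequences.

    Gaps are ignored when voting, columns made only of gaps are dropped,
    and columns where the two most common bases tie are marked ambiguous.
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    out = []
    for column in zip(*aligned):
        counts = Counter(base for base in column if base != gap)
        if not counts:
            continue  # every sequence has a gap here
        ranked = counts.most_common(2)
        if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            out.append(ambiguous)  # tie: leave for manual editing
        else:
            out.append(ranked[0][0])
    return "".join(out)
```

Sites returned as "N" correspond to the ambiguous positions that, in the procedure described above, were resolved by inspecting the alignment of the individual clones.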

Each contig was translated in all six reading frames into protein
sequences. The M. alpina Δ5 (MA29) and Δ6 (MA524) sequences were
compared with each of the translated contigs using the FastA search (a Pearson
and Lipman search for similarity between a query sequence and a group of
sequences of the same type (nucleic acid or protein)). Homology among these
sequences suggests the open reading frames of each contig. The homology
among the M. alpina Δ5 and Δ6 to contigs 2535 and 3854933 was utilized to
create the final contig called 253538a. Figure 13 is the FastA match of the final
contig 253538a and MA29, and Figure 14 is the FastA match of the final contig
253538a and MA524. The DNA sequences for the various contigs are
presented in SEQ ID NO:21 - SEQ ID NO:27. The various peptide sequences
are shown in SEQ ID NO:28 - SEQ ID NO:34.

Although the open reading frame was generated by merging the two
contigs, the contig 2535 shows that there is a unique sequence in the beginning
of this contig which does not match with the contig 3854933. Therefore, it is
possible that these contigs were generated from independent desaturase-like
human genes.

The contig 253538a contains an open reading frame encoding 432
amino acids. It starts with Gln (CAG) and ends with the stop codon (TGA).
The contig 253538a aligns with both M. alpina Δ5 and Δ6 sequences,
suggesting that it could be either of the desaturases, as well as other known
desaturases which share homology with each other. The individual contigs
listed in Table 6, as well as the intermediate contig 2535 and the final contig
253538a, can be utilized to isolate the complete genes for human desaturases.
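
A reading frame described in these terms (first codon, terminal stop codon, number of encoded amino acids) can be checked with a short helper. The sketch below is illustrative only; the function name is hypothetical and the input is assumed to be an in-frame nucleotide string. If the stop codon is counted as part of the frame, 432 amino acids correspond to 433 codons, or 1299 nucleotides.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_summary(dna):
    """Summarize an in-frame nucleotide sequence codon by codon."""
    dna = dna.upper()
    if not dna or len(dna) % 3 != 0:
        raise ValueError("sequence length must be a non-zero multiple of 3")
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    ends_with_stop = codons[-1] in STOP_CODONS
    return {
        "first_codon": codons[0],
        "last_codon": codons[-1],
        # The terminal stop codon, if present, does not encode a residue.
        "amino_acids": len(codons) - (1 if ends_with_stop else 0),
        # Indices of premature stop codons before the final codon.
        "internal_stops": [
            i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS
        ],
    }
```

An empty `internal_stops` list together with the expected first and last codons would be consistent with an uninterrupted open reading frame.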

Uses of the human desaturases

These human sequences can be expressed in yeast and plants utilizing
the procedures described in the preceding examples. For expression in
mammalian cells and transgenic animals, these genes may provide superior
codon bias. These human sequences can also be used to identify related
desaturase sequences.



Table 6

Sections of the Desaturases    Clone ID from LifeSeq Database    Keyword
151-300 Δ5                     3808675                           Fatty acid desaturase
301-446 Δ5                     354535                            Δ6
151-300 Δ6                     3448789                           Δ6
151-300 Δ6                     1362863                           Δ6
151-300 Δ6                     2394760                           Δ6
301-457 Δ6                     3350263                           Δ6



Example 10 



Nutritional Compositions 

The PUFAs of the previous examples can be utilized in various
nutritional supplements, infant formulations, nutritional substitutes and other
nutrition solutions.

I. INFANT FORMULATIONS

A. Isomil® Soy Formula With Iron.

Usage: As a beverage for infants, children and adults with an allergy or
sensitivity to cow's milk. A feeding for patients with disorders for which
lactose should be avoided: lactase deficiency, lactose intolerance and
galactosemia.

Features: 

• Soy protein isolate to avoid symptoms of cow's-milk-protein allergy
or sensitivity.

• Lactose-free formulation to avoid lactose-associated diarrhea.

• Low osmolality (240 mOsm/kg water) to reduce risk of osmotic
diarrhea.

• Dual carbohydrates (corn syrup and sucrose) designed to enhance
carbohydrate absorption and reduce the risk of exceeding the absorptive
capacity of the damaged gut.

• 1.8 mg of iron (as ferrous sulfate) per 100 Calories to help prevent
iron deficiency.

• Recommended levels of vitamins and minerals.

• Vegetable oils to provide recommended levels of essential fatty
acids.

• Milk-white color, milk-like consistency and pleasant aroma.

Ingredients: (Pareve, ®) 85% water, 4.9% corn syrup, 2.6% sugar
(sucrose), 2.1% soy oil, 1.9% soy protein isolate, 1.4% coconut oil, 0.15%
calcium citrate, 0.11% calcium phosphate tribasic, potassium citrate, potassium
phosphate monobasic, potassium chloride, mono- and diglycerides, soy
lecithin, carrageenan, ascorbic acid, L-methionine, magnesium chloride,
potassium phosphate dibasic, sodium chloride, choline chloride, taurine, ferrous
sulfate, m-inositol, alpha-tocopheryl acetate, zinc sulfate, L-carnitine,
niacinamide, calcium pantothenate, cupric sulfate, vitamin A palmitate,
thiamine chloride hydrochloride, riboflavin, pyridoxine hydrochloride, folic
acid, manganese sulfate, potassium iodide, phylloquinone, biotin, sodium
selenite, vitamin D3 and cyanocobalamin.

B. Isomil® DF Soy Formula For Diarrhea. 

Usage: As a short-term feeding for the dietary management of diarrhea
in infants and toddlers.

Features: 

• First infant formula to contain added dietary fiber from soy fiber 
specifically for diarrhea management. 

• Clinically shown to reduce the duration of loose, watery stools
during mild to severe diarrhea in infants.

• Nutritionally complete to meet the nutritional needs of the infant.

• Soy protein isolate with added L-methionine meets or exceeds an
infant's requirement for all essential amino acids.

• Lactose-free formulation to avoid lactose-associated diarrhea.

• Low osmolality (240 mOsm/kg water) to reduce the risk of osmotic
diarrhea.

• Dual carbohydrates (corn syrup and sucrose) designed to enhance
carbohydrate absorption and reduce the risk of exceeding the absorptive
capacity of the damaged gut.

• Meets or exceeds the vitamin and mineral levels recommended by
the Committee on Nutrition of the American Academy of Pediatrics and
required by the Infant Formula Act.

• 1.8 mg of iron (as ferrous sulfate) per 100 Calories to help prevent
iron deficiency.

• Vegetable oils to provide recommended levels of essential fatty
acids.

Ingredients: (Pareve, ®) 86% water, 4.8% corn syrup, 2.5% sugar
(sucrose), 2.1% soy oil, 2.0% soy protein isolate, 1.4% coconut oil, 0.77% soy
fiber, 0.12% calcium citrate, 0.11% calcium phosphate tribasic, 0.10%
potassium citrate, potassium chloride, potassium phosphate monobasic, mono-
and diglycerides, soy lecithin, carrageenan, magnesium chloride, ascorbic acid,
L-methionine, potassium phosphate dibasic, sodium chloride, choline chloride,
taurine, ferrous sulfate, m-inositol, alpha-tocopheryl acetate, zinc sulfate,
L-carnitine, niacinamide, calcium pantothenate, cupric sulfate, vitamin A
palmitate, thiamine chloride hydrochloride, riboflavin, pyridoxine
hydrochloride, folic acid, manganese sulfate, potassium iodide, phylloquinone,
biotin, sodium selenite, vitamin D3 and cyanocobalamin.

C. Isomil® SF Sucrose-Free Soy Formula With Iron.

Usage: As a beverage for infants, children and adults with an allergy or 
sensitivity to cow's-milk protein or an intolerance to sucrose. A feeding for 
patients with disorders for which lactose and sucrose should be avoided. 

Features: 

• Soy protein isolate to avoid symptoms of cow's-milk-protein allergy
or sensitivity.

• Lactose-free formulation to avoid lactose-associated diarrhea
(carbohydrate source is Polycose® Glucose Polymers).

• Sucrose free for the patient who cannot tolerate sucrose.

• Low osmolality (180 mOsm/kg water) to reduce risk of osmotic
diarrhea.

• 1.8 mg of iron (as ferrous sulfate) per 100 Calories to help prevent
iron deficiency.

• Recommended levels of vitamins and minerals.

• Vegetable oils to provide recommended levels of essential fatty
acids.

• Milk-white color, milk-like consistency and pleasant aroma.

Ingredients: (Pareve, ®) 75% water, 11.8% hydrolyzed cornstarch, 4.1%
soy oil, 4.1% soy protein isolate, 2.8% coconut oil, 1.0% modified cornstarch,
0.38% calcium phosphate tribasic, 0.17% potassium citrate, 0.13% potassium
chloride, mono- and diglycerides, soy lecithin, magnesium chloride, ascorbic
acid, L-methionine, calcium carbonate, sodium chloride, choline chloride,
carrageenan, taurine, ferrous sulfate, m-inositol, alpha-tocopheryl acetate, zinc
sulfate, L-carnitine, niacinamide, calcium pantothenate, cupric sulfate, vitamin
A palmitate, thiamine chloride hydrochloride, riboflavin, pyridoxine
hydrochloride, folic acid, manganese sulfate, potassium iodide, phylloquinone,
biotin, sodium selenite, vitamin D3 and cyanocobalamin.

D. Isomil® 20 Soy Formula With Iron Ready To Feed,
20 Cal/fl oz.

Usage: When a soy feeding is desired. 

Ingredients: (Pareve, ®) 85% water, 4.9% corn syrup, 2.6% sugar
(sucrose), 2.1% soy oil, 1.9% soy protein isolate, 1.4% coconut oil, 0.15%
calcium citrate, 0.11% calcium phosphate tribasic, potassium citrate, potassium
phosphate monobasic, potassium chloride, mono- and diglycerides, soy
lecithin, carrageenan, ascorbic acid, L-methionine, magnesium chloride,
potassium phosphate dibasic, sodium chloride, choline chloride, taurine, ferrous
sulfate, m-inositol, alpha-tocopheryl acetate, zinc sulfate, L-carnitine,
niacinamide, calcium pantothenate, cupric sulfate, vitamin A palmitate,
thiamine chloride hydrochloride, riboflavin, pyridoxine hydrochloride, folic
acid, manganese sulfate, potassium iodide, phylloquinone, biotin, sodium
selenite, vitamin D3 and cyanocobalamin.

E. Similac® Infant Formula 

Usage: When an infant formula is needed: if the decision is made to
discontinue breastfeeding before age 1 year, if a supplement to breastfeeding is
needed or as a routine feeding if breastfeeding is not adopted.

Features: 

• Protein of appropriate quality and quantity for good growth; heat- 
denatured, which reduces the risk of milk-associated enteric blood loss. 

• Fat from a blend of vegetable oils (doubly homogenized), providing
essential linoleic acid that is easily absorbed.

• Carbohydrate as lactose in proportion similar to that of human milk.

• Low renal solute load to minimize stress on developing organs.

• Powder, Concentrated Liquid and Ready To Feed forms.

Ingredients: (®-D) Water, nonfat milk, lactose, soy oil, coconut oil,
mono- and diglycerides, soy lecithin, ascorbic acid, carrageenan, choline
chloride, taurine, m-inositol, alpha-tocopheryl acetate, zinc sulfate, niacinamide,
ferrous sulfate, calcium pantothenate, cupric sulfate, vitamin A palmitate,
thiamine chloride hydrochloride, riboflavin, pyridoxine hydrochloride, folic
acid, manganese sulfate, phylloquinone, biotin, sodium selenite, vitamin D3 and
cyanocobalamin.

F. Similac® NeoCare Premature Infant Formula With Iron 

Usage: For premature infants' special nutritional needs after hospital
discharge. Similac NeoCare is a nutritionally complete formula developed to
provide premature infants with extra calories, protein, vitamins and minerals
needed to promote catch-up growth and support development.

Features: 

• Reduces the need for caloric and vitamin supplementation. More
calories (22 Cal/fl oz) than standard term formulas (20 Cal/fl oz).

• Highly absorbed fat blend, with medium-chain triglycerides (MCT
oil) to help meet the special digestive needs of premature infants.

• Higher levels of protein, vitamins and minerals per 100 Calories to
extend the nutritional support initiated in-hospital.

• More calcium and phosphorus for improved bone mineralization.

Ingredients: ®-D Corn syrup solids, nonfat milk, lactose, whey protein
concentrate, soy oil, high-oleic safflower oil, fractionated coconut oil (medium-chain
triglycerides), coconut oil, potassium citrate, calcium phosphate tribasic,
calcium carbonate, ascorbic acid, magnesium chloride, potassium chloride,
sodium chloride, taurine, ferrous sulfate, m-inositol, choline chloride, ascorbyl
palmitate, L-carnitine, alpha-tocopheryl acetate, zinc sulfate, niacinamide,
mixed tocopherols, sodium citrate, calcium pantothenate, cupric sulfate,
thiamine chloride hydrochloride, vitamin A palmitate, beta carotene, riboflavin,
pyridoxine hydrochloride, folic acid, manganese sulfate, phylloquinone, biotin,
sodium selenite, vitamin D3 and cyanocobalamin.

G. Similac Natural Care Low-Iron Human Milk Fortifier Ready 
To Use, 24 Cal/fl oz. 

Usage: Designed to be mixed with human milk or to be fed alternatively
with human milk to low-birth-weight infants.

Ingredients: ®-D Water, nonfat milk, hydrolyzed cornstarch, lactose,
fractionated coconut oil (medium-chain triglycerides), whey protein
concentrate, soy oil, coconut oil, calcium phosphate tribasic, potassium citrate,
magnesium chloride, sodium citrate, ascorbic acid, calcium carbonate, mono-
and diglycerides, soy lecithin, carrageenan, choline chloride, m-inositol, taurine,
niacinamide, L-carnitine, alpha tocopheryl acetate, zinc sulfate, potassium
chloride, calcium pantothenate, ferrous sulfate, cupric sulfate, riboflavin,
vitamin A palmitate, thiamine chloride hydrochloride, pyridoxine
hydrochloride, biotin, folic acid, manganese sulfate, phylloquinone, vitamin D3,
sodium selenite and cyanocobalamin.

Various PUFAs of this invention can be substituted and/or added to the
infant formulae described above and to other infant formulae known to those in
the art.

II. NUTRITIONAL FORMULATIONS 

A. ENSURE®

Usage: ENSURE is a low-residue liquid food designed primarily as an
oral nutritional supplement to be used with or between meals or, in appropriate
amounts, as a meal replacement. ENSURE is lactose- and gluten-free, and is
suitable for use in modified diets, including low-cholesterol diets. Although it
is primarily an oral supplement, it can be fed by tube.

Patient Conditions:

• For patients on modified diets

• For elderly patients at nutrition risk

• For patients with involuntary weight loss

• For patients recovering from illness or surgery

• For patients who need a low-residue diet

Ingredients:

®-D Water, Sugar (Sucrose), Maltodextrin (Corn), Calcium and Sodium
Caseinates, High-Oleic Safflower Oil, Soy Protein Isolate, Soy Oil, Canola Oil,
Potassium Citrate, Calcium Phosphate Tribasic, Sodium Citrate, Magnesium
Chloride, Magnesium Phosphate Dibasic, Artificial Flavor, Sodium Chloride,
Soy Lecithin, Choline Chloride, Ascorbic Acid, Carrageenan, Zinc Sulfate,
Ferrous Sulfate, Alpha-Tocopheryl Acetate, Gellan Gum, Niacinamide,
Calcium Pantothenate, Manganese Sulfate, Cupric Sulfate, Vitamin A
Palmitate, Thiamine Chloride Hydrochloride, Pyridoxine Hydrochloride,
Riboflavin, Folic Acid, Sodium Molybdate, Chromium Chloride, Biotin,
Potassium Iodide, Sodium Selenate.



B. ENSURE® BARS

Usage: ENSURE BARS are complete, balanced nutrition for
supplemental use between or with meals. They provide a delicious, nutrient-rich
alternative to other snacks. ENSURE BARS contain <1 g lactose/bar, and
Chocolate Fudge Brownie flavor is gluten-free. (Honey Graham Crunch flavor
contains gluten.)

Patient Conditions:

• For patients who need extra calories, protein, vitamins and minerals

• Especially useful for people who do not take in enough calories and
nutrients

• For people who have the ability to chew and swallow

• Not to be used by anyone with a peanut allergy or any type of allergy to
nuts.

Ingredients: 

Honey Graham Crunch: High-Fructose Corn Syrup, Soy Protein
Isolate, Brown Sugar, Honey, Maltodextrin (Corn), Crisp Rice (Milled Rice,
Sugar [Sucrose], Salt [Sodium Chloride] and Malt), Oat Bran, Partially
Hydrogenated Cottonseed and Soy Oils, Soy Polysaccharide, Glycerine, Whey
Protein Concentrate, Polydextrose, Fructose, Calcium Caseinate, Cocoa
Powder, Artificial Flavors, Canola Oil, High-Oleic Safflower Oil, Nonfat Dry
Milk, Whey Powder, Soy Lecithin and Corn Oil. Manufactured in a facility that
processes nuts.

Vitamins and Minerals: 

Calcium Phosphate Tribasic, Potassium Phosphate Dibasic, Magnesium
Oxide, Salt (Sodium Chloride), Potassium Chloride, Ascorbic Acid, Ferric
Orthophosphate, Alpha-Tocopheryl Acetate, Niacinamide, Zinc Oxide, Calcium
Pantothenate, Copper Gluconate, Manganese Sulfate, Riboflavin, Beta-Carotene,
Pyridoxine Hydrochloride, Thiamine Mononitrate, Folic Acid, Biotin,
Chromium Chloride, Potassium Iodide, Sodium Selenate, Sodium Molybdate,
Phylloquinone, Vitamin D3 and Cyanocobalamin.

Protein: 

Honey Graham Crunch - The protein source is a blend of soy protein isolate 
and milk proteins. 

Soy protein isolate    74%
Milk proteins          26%

Fat: 

Honey Graham Crunch - The fat source is a blend of partially
hydrogenated cottonseed and soybean, canola, high-oleic safflower, and corn
oils, and soy lecithin.

Partially hydrogenated cottonseed and soybean oil    76%
Canola oil                                            8%
High-oleic safflower oil                              8%
Corn oil                                              4%
Soy lecithin                                          4%

Carbohydrate: 



Honey Graham Crunch - The carbohydrate source is a combination of
high-fructose corn syrup, brown sugar, maltodextrin, honey, crisp rice,
glycerine, soy polysaccharide, and oat bran.

High-fructose corn syrup    24%
Brown sugar                 21%
Maltodextrin                12%
Honey                       11%
Crisp rice                   9%
Glycerine                    9%
Soy polysaccharide           7%
Oat bran                     7%
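
As a consistency check on the Honey Graham Crunch breakdowns listed above, each blend should account for 100% of its macronutrient source. The following Python sketch transcribes the percentages from the text; the dictionary layout and function names are illustrative assumptions.

```python
# Percentages transcribed from the Honey Graham Crunch breakdowns in the text.
BLENDS = {
    "protein": {"soy protein isolate": 74, "milk proteins": 26},
    "fat": {
        "partially hydrogenated cottonseed and soybean oil": 76,
        "canola oil": 8,
        "high-oleic safflower oil": 8,
        "corn oil": 4,
        "soy lecithin": 4,
    },
    "carbohydrate": {
        "high-fructose corn syrup": 24,
        "brown sugar": 21,
        "maltodextrin": 12,
        "honey": 11,
        "crisp rice": 9,
        "glycerine": 9,
        "soy polysaccharide": 7,
        "oat bran": 7,
    },
}


def blend_total(blend):
    """Sum the percentage contributions of one blend."""
    return sum(blend.values())


def incomplete_blends(blends, expected=100):
    """Return the names of blends whose components do not sum to `expected`."""
    return [name for name, blend in blends.items() if blend_total(blend) != expected]
```

An empty result from `incomplete_blends(BLENDS)` indicates that the protein, fat and carbohydrate percentages are each internally consistent.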



C. ENSURE® HIGH PROTEIN 

Usage: ENSURE HIGH PROTEIN is a concentrated, high-protein
liquid food designed for people who require additional calories, protein,
vitamins, and minerals in their diets. It can be used as an oral nutritional
supplement with or between meals or, in appropriate amounts, as a meal
replacement. ENSURE HIGH PROTEIN is lactose- and gluten-free, and is
suitable for use by people recovering from general surgery or hip fractures and
by patients at risk for pressure ulcers.

Patient Conditions:

• For patients who require additional calories, protein, vitamins, and minerals,
such as patients recovering from general surgery or hip fractures, patients at risk
for pressure ulcers, and patients on low-cholesterol diets




Features:

• Low in saturated fat

• Contains 6 g of total fat and < 5 mg of cholesterol per serving

• Rich, creamy taste

• Excellent source of protein, calcium, and other essential vitamins and
minerals

• For low-cholesterol diets

• Lactose-free, easily digested

Ingredients:

Vanilla Supreme: ®-D Water, Sugar (Sucrose), Maltodextrin (Corn), Calcium
and Sodium Caseinates, High-Oleic Safflower Oil, Soy Protein Isolate, Soy Oil,
Canola Oil, Potassium Citrate, Calcium Phosphate Tribasic, Sodium Citrate,
Magnesium Chloride, Magnesium Phosphate Dibasic, Artificial Flavor, Sodium
Chloride, Soy Lecithin, Choline Chloride, Ascorbic Acid, Carrageenan, Zinc
Sulfate, Ferrous Sulfate, Alpha-Tocopheryl Acetate, Gellan Gum, Niacinamide,
Calcium Pantothenate, Manganese Sulfate, Cupric Sulfate, Vitamin A
Palmitate, Thiamine Chloride Hydrochloride, Pyridoxine Hydrochloride,
Riboflavin, Folic Acid, Sodium Molybdate, Chromium Chloride, Biotin,
Potassium Iodide, Sodium Selenate, Phylloquinone, Vitamin D3 and
Cyanocobalamin.

Protein: 

The protein source is a blend of two high-biologic-value proteins: casein and 
soy. 

Sodium and calcium caseinates    85%
Soy protein isolate              15%

Fat: 

The fat source is a blend of three oils: high-oleic safflower, canola, and soy.

High-oleic safflower oil    40%
Canola oil                  30%
Soy oil                     30%

The level of fat in ENSURE HIGH PROTEIN meets American Heart
Association (AHA) guidelines. The 6 grams of fat in ENSURE HIGH
PROTEIN represent 24% of the total calories, with 2.6% of the fat being from
saturated fatty acids and 7.9% from polyunsaturated fatty acids. These values
are within the AHA guidelines of < 30% of total calories from fat, < 10% of the
calories from saturated fatty acids, and < 10% of total calories from
polyunsaturated fatty acids.
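
The relationship between grams of fat and percent of total calories can be illustrated with a short calculation. The sketch below assumes the general factor of 9 Calories per gram of fat, which is not stated in the text; the function names are hypothetical, and the serving energy is back-calculated from the stated figures rather than taken from the text.

```python
KCAL_PER_GRAM_FAT = 9.0  # general energy factor for fat; an assumption, not from the text


def fat_kcal(fat_grams):
    """Energy contributed by fat, in Calories (kcal)."""
    return fat_grams * KCAL_PER_GRAM_FAT


def percent_calories_from_fat(fat_grams, total_kcal):
    """Share of a serving's total energy that comes from fat, in percent."""
    return 100.0 * fat_kcal(fat_grams) / total_kcal


def implied_total_kcal(fat_grams, percent_from_fat):
    """Total serving energy implied by a fat mass and its stated energy share."""
    return fat_kcal(fat_grams) / (percent_from_fat / 100.0)
```

Under this assumption, 6 g of fat supply 54 Calories, so a 24% share implies a serving of about 225 Calories; a product with 3 g of fat at 13.5% of calories would likewise imply about 200 Calories per serving.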

Carbohydrate:

ENSURE HIGH PROTEIN contains a combination of maltodextrin and
sucrose. The mild sweetness and flavor variety (vanilla supreme, chocolate
royal, wild berry, and banana), plus VARI-FLAVORS® Flavor Pacs in pecan,
cherry, strawberry, lemon, and orange, help to prevent flavor fatigue and aid in
patient compliance.

Vanilla and other nonchocolate flavors

Sucrose         60%
Maltodextrin    40%

Chocolate

Sucrose         70%
Maltodextrin    30%



D. ENSURE® LIGHT

Usage: ENSURE LIGHT is a low-fat liquid food designed for use as an
oral nutritional supplement with or between meals. ENSURE LIGHT is
lactose- and gluten-free, and is suitable for use in modified diets, including
low-cholesterol diets.

Patient Conditions: 

• For normal-weight or overweight patients who need extra nutrition in a
supplement that contains 50% less fat and 20% fewer calories than ENSURE

• For healthy adults who don't eat right and need extra nutrition

Features:

• Low in fat and saturated fat

• Contains 3 g of total fat per serving and < 5 mg cholesterol

• Rich, creamy taste

• Excellent source of calcium and other essential vitamins and minerals

• For low-cholesterol diets

• Lactose-free, easily digested

Ingredients: 

French Vanilla: ®-D Water, Maltodextrin (Corn), Sugar (Sucrose), Calcium
Caseinate, High-Oleic Safflower Oil, Canola Oil, Magnesium Chloride, Sodium
Citrate, Potassium Citrate, Potassium Phosphate Dibasic, Magnesium Phosphate
Dibasic, Natural and Artificial Flavor, Calcium Phosphate Tribasic, Cellulose
Gel, Choline Chloride, Soy Lecithin, Carrageenan, Salt (Sodium Chloride),
Ascorbic Acid, Cellulose Gum, Ferrous Sulfate, Alpha-Tocopheryl Acetate,
Zinc Sulfate, Niacinamide, Manganese Sulfate, Calcium Pantothenate, Cupric
Sulfate, Thiamine Chloride Hydrochloride, Vitamin A Palmitate, Pyridoxine
Hydrochloride, Riboflavin, Chromium Chloride, Folic Acid, Sodium
Molybdate, Biotin, Potassium Iodide, Sodium Selenate, Phylloquinone, Vitamin
D3 and Cyanocobalamin.

Protein: 

The protein source is calcium caseinate.

Calcium caseinate    100%

Fat

The fat source is a blend of two oils: high-oleic safflower and canola.

High-oleic safflower oil    70%
Canola oil                  30%

The level of fat in ENSURE LIGHT meets American Heart Association
(AHA) guidelines. The 3 grams of fat in ENSURE LIGHT represent 13.5% of
the total calories, with 1.4% of the fat being from saturated fatty acids and 2.6%
from polyunsaturated fatty acids. These values are within the AHA guidelines
of < 30% of total calories from fat, < 10% of the calories from saturated fatty
acids, and < 10% of total calories from polyunsaturated fatty acids.

Carbohydrate 

ENSURE LIGHT contains a combination of maltodextrin and sucrose.
The chocolate flavor contains corn syrup as well. The mild sweetness and
flavor variety (French vanilla, chocolate supreme, strawberry swirl), plus
VARI-FLAVORS® Flavor Pacs in pecan, cherry, strawberry, lemon, and
orange, help to prevent flavor fatigue and aid in patient compliance.

Vanilla and other nonchocolate flavors

Sucrose         51%
Maltodextrin    49%

Chocolate

Sucrose         47.0%
Corn Syrup      26.5%
Maltodextrin    26.5%



Vitamins and Minerals 

An 8-fl-oz serving of ENSURE LIGHT provides at least 25% of the 
RDIs for 24 key vitamins and minerals. 

Caffeine

Chocolate flavor contains 2.1 mg caffeine/8 fl oz. 

E. ENSURE PLUS®

Usage: ENSURE PLUS is a high-calorie, low-residue liquid food for
use when extra calories and nutrients, but a normal concentration of protein, are
needed. It is designed primarily as an oral nutritional supplement to be used
with or between meals or, in appropriate amounts, as a meal replacement.
ENSURE PLUS is lactose- and gluten-free. Although it is primarily an oral
nutritional supplement, it can be fed by tube.

Patient Conditions: 

• For patients who require extra calories and nutrients, but a normal
concentration of protein, in a limited volume

• For patients who need to gain or maintain healthy weight

Features

• Rich, creamy taste

• Good source of essential vitamins and minerals

Ingredients

Vanilla: ®-D Water, Corn Syrup, Maltodextrin (Corn), Corn Oil, Sodium and
Calcium Caseinates, Sugar (Sucrose), Soy Protein Isolate, Magnesium Chloride,
Potassium Citrate, Calcium Phosphate Tribasic, Soy Lecithin, Natural and
Artificial Flavor, Sodium Citrate, Potassium Chloride, Choline Chloride,
Ascorbic Acid, Carrageenan, Zinc Sulfate, Ferrous Sulfate, Alpha-Tocopheryl
Acetate, Niacinamide, Calcium Pantothenate, Manganese Sulfate, Cupric
Sulfate, Thiamine Chloride Hydrochloride, Pyridoxine Hydrochloride,
Riboflavin, Vitamin A Palmitate, Folic Acid, Biotin, Chromium Chloride,
Sodium Molybdate, Potassium Iodide, Sodium Selenite, Phylloquinone,
Cyanocobalamin and Vitamin D3.

Protein 

The protein source is a blend of two high-biologic-value proteins: casein 
and soy. 

Sodium and calcium caseinates    84%
Soy protein isolate              16%

Fat

The fat source is corn oil.

Corn oil    100%

Carbohydrate

ENSURE PLUS contains a combination of maltodextrin and sucrose.
The mild sweetness and flavor variety (vanilla, chocolate, strawberry, coffee,
butter pecan, and eggnog), plus VARI-FLAVORS® Flavor Pacs in pecan,
cherry, strawberry, lemon, and orange, help to prevent flavor fatigue and aid in
patient compliance.

Vanilla, strawberry, butter pecan, and coffee flavors

Corn Syrup      39%
Maltodextrin    38%
Sucrose         23%

Chocolate and eggnog flavors

Corn Syrup      36%
Maltodextrin    34%
Sucrose         30%

Vitamins and Minerals

An 8-fl-oz serving of ENSURE PLUS provides at least 15% of the RDIs
for 25 key vitamins and minerals.

Caffeine 

Chocolate flavor contains 3.1 mg caffeine/8 fl oz. Coffee flavor
contains a trace amount of caffeine.

F. ENSURE PLUS® HN 

Usage: ENSURE PLUS HN is a nutritionally complete high-calorie,
high-nitrogen liquid food designed for people with higher calorie and protein
needs or limited volume tolerance. It may be used for oral supplementation or
for total nutritional support by tube. ENSURE PLUS HN is lactose- and
gluten-free.

Patient Conditions: 

• For patients with increased calorie and protein needs, such as following
surgery or injury

• For patients with limited volume tolerance and early satiety

Features

• For supplemental or total nutrition

• For oral or tube feeding

• 1.5 Cal/mL

• High nitrogen

• Calorically dense

Ingredients 

Vanilla: ®-D Water, Maltodextrin (Corn), Sodium and Calcium Caseinates,
Corn Oil, Sugar (Sucrose), Soy Protein Isolate, Magnesium Chloride, Potassium
Citrate, Calcium Phosphate Tribasic, Soy Lecithin, Natural and Artificial
Flavor, Sodium Citrate, Choline Chloride, Ascorbic Acid, Taurine, L-Carnitine,
Zinc Sulfate, Ferrous Sulfate, Alpha-Tocopheryl Acetate, Niacinamide,
Carrageenan, Calcium Pantothenate, Manganese Sulfate, Cupric Sulfate,
Thiamine Chloride Hydrochloride, Pyridoxine Hydrochloride, Riboflavin,
Vitamin A Palmitate, Folic Acid, Biotin, Chromium Chloride, Sodium
Molybdate, Potassium Iodide, Sodium Selenite, Phylloquinone,
Cyanocobalamin and Vitamin D3.

G. ENSURE® POWDER 

Usage: ENSURE POWDER (reconstituted with water) is a low-residue
liquid food designed primarily as an oral nutritional supplement to be used with
or between meals. ENSURE POWDER is lactose- and gluten-free, and is
suitable for use in modified diets, including low-cholesterol diets.

S Patient Conditions: 

• For patients on modified diets 

• For elderly patients at nutrition risk 

• For patients recovering from illness/surgery 

• For patients who need a low-residue diet 
Features

• Convenient, easy to mix 

• Low in saturated fat 

• Contains 9 g of total fat and < 5 mg of cholesterol per serving

• High in vitamins and minerals 

• For low-cholesterol diets

• Lactose-free, easily digested 

Ingredients: ®-D Corn Syrup, Maltodextrin (Corn), Sugar (Sucrose), Corn Oil,
Sodium and Calcium Caseinates, Soy Protein Isolate, Artificial Flavor,
Potassium Citrate, Magnesium Chloride, Sodium Citrate, Calcium Phosphate
Tribasic, Potassium Chloride, Soy Lecithin, Ascorbic Acid, Choline Chloride,
Zinc Sulfate, Ferrous Sulfate, Alpha-Tocopheryl Acetate, Niacinamide,
Calcium Pantothenate, Manganese Sulfate, Thiamine Chloride Hydrochloride,
Cupric Sulfate, Pyridoxine Hydrochloride, Riboflavin, Vitamin A Palmitate,
Folic Acid, Biotin, Sodium Molybdate, Chromium Chloride, Potassium Iodide,
Sodium Selenate, Phylloquinone, Vitamin D3 and Cyanocobalamin.

Protein 

The protein source is a blend of two high-biologic-value proteins: casein
and soy.

Sodium and calcium caseinates 84%
Soy protein isolate 16%

Fat

The fat source is corn oil.

Corn oil 100%

Carbohydrate 

ENSURE POWDER contains a combination of corn syrup,
maltodextrin, and sucrose. The mild sweetness of ENSURE POWDER, plus
VARI-FLAVORS® Flavor Pacs in pecan, cherry, strawberry, lemon, and
orange, helps to prevent flavor fatigue and aid in patient compliance.

Vanilla
Corn Syrup 35%
Maltodextrin 35%
Sucrose 30%

H. ENSURE® PUDDING 

Usage: ENSURE PUDDING is a nutrient-dense supplement providing 
balanced nutrition in a nonliquid form to be used with or between meals. It is 
appropriate for consistency-modified diets (e.g., soft, pureed, or full liquid) or 
for people with swallowing impairments. ENSURE PUDDING is gluten-free.

Patient Conditions: 

• For patients on consistency-modified diets (e.g., soft, pureed, or full liquid) 

• For patients with swallowing impairments 

Features

• Rich and creamy, good taste

• Good source of essential vitamins and minerals

• Convenient, needs no refrigeration

• Gluten-free

Nutrient Profile per 5 oz: Calories 250, Protein 10.9%, Total Fat 34.9%,
Carbohydrate 54.2%

Ingredients:

Vanilla: ®-D Nonfat Milk, Water, Sugar (Sucrose), Partially Hydrogenated
Soybean Oil, Modified Food Starch, Magnesium Sulfate, Sodium Stearoyl
Lactylate, Sodium Phosphate Dibasic, Artificial Flavor, Ascorbic Acid, Zinc
Sulfate, Ferrous Sulfate, Alpha-Tocopheryl Acetate, Choline Chloride,
Niacinamide, Manganese Sulfate, Calcium Pantothenate, FD&C Yellow #5,
Potassium Citrate, Cupric Sulfate, Vitamin A Palmitate, Thiamine Chloride
Hydrochloride, Pyridoxine Hydrochloride, Riboflavin, FD&C Yellow #6, Folic
Acid, Biotin, Phylloquinone, Vitamin D3 and Cyanocobalamin.

Protein 

The protein source is nonfat milk.

Nonfat milk 100%

Fat

The fat source is hydrogenated soybean oil.

Hydrogenated soybean oil 100%

Carbohydrate 

ENSURE PUDDING contains a combination of sucrose and modified
food starch. The mild sweetness and flavor variety (vanilla, chocolate,
butterscotch, and tapioca) help prevent flavor fatigue. The product contains 9.2
grams of lactose per serving.

Vanilla and other nonchocolate flavors
Sucrose 56%
Lactose 27%
Modified food starch 17%

Chocolate
Sucrose 58%
Lactose 26%
Modified food starch 16%

I. ENSURE® WITH FIBER

Usage: ENSURE WITH FIBER is a fiber-containing, nutritionally
complete liquid food designed for people who can benefit from increased
dietary fiber and nutrients. ENSURE WITH FIBER is suitable for people who
do not require a low-residue diet. It can be fed orally or by tube, and can be
used as a nutritional supplement to a regular diet or, in appropriate amounts, as
a meal replacement. ENSURE WITH FIBER is lactose- and gluten-free, and is
suitable for use in modified diets, including low-cholesterol diets.

Patient Conditions 

• For patients who can benefit from increased dietary fiber and nutrients

Features

• New advanced formula: low in saturated fat, higher in vitamins and minerals

• Contains 6 g of total fat and < 5 mg of cholesterol per serving 

• Rich, creamy taste 

• Good source of fiber 

• Excellent source of essential vitamins and minerals

• For low-cholesterol diets 

• Lactose- and gluten-free 
Ingredients 

Vanilla: ®-D Water, Maltodextrin (Corn), Sugar (Sucrose), Sodium and
Calcium Caseinates, Oat Fiber, High-Oleic Safflower Oil, Canola Oil, Soy
Protein Isolate, Corn Oil, Soy Fiber, Calcium Phosphate Tribasic, Magnesium
Chloride, Potassium Citrate, Cellulose Gel, Soy Lecithin, Potassium Phosphate
Dibasic, Sodium Citrate, Natural and Artificial Flavors, Choline Chloride,
Magnesium Phosphate, Ascorbic Acid, Cellulose Gum, Potassium Chloride,
Carrageenan, Ferrous Sulfate, Alpha-Tocopheryl Acetate, Zinc Sulfate,
Niacinamide, Manganese Sulfate, Calcium Pantothenate, Cupric Sulfate,
Vitamin A Palmitate, Thiamine Chloride Hydrochloride, Pyridoxine
Hydrochloride, Riboflavin, Folic Acid, Chromium Chloride, Biotin, Sodium
Molybdate, Potassium Iodide, Sodium Selenate, Phylloquinone, Vitamin D3 and
Cyanocobalamin.

Protein 

The protein source is a blend of two high-biologic-value proteins: casein
and soy.

Sodium and calcium caseinates 80%
Soy protein isolate 20%

Fat 

The fat source is a blend of three oils: high-oleic safflower, canola, and
corn.

High-oleic safflower oil 40%
Canola oil 40%
Corn oil 20%

The level of fat in ENSURE WITH FIBER meets American Heart
Association (AHA) guidelines. The 6 grams of fat in ENSURE WITH FIBER
represent 22% of the total calories, with 2.01% of the fat being from saturated
fatty acids and 6.7% from polyunsaturated fatty acids. These values are within
the AHA guidelines of < 30% of total calories from fat, < 10% of the calories
from saturated fatty acids, and < 10% of total calories from polyunsaturated
fatty acids.
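
The comparison against the AHA limits can be written as a simple threshold check. The sketch below is illustrative only: the dictionary keys and the function name are hypothetical, and it assumes the quoted saturated and polyunsaturated figures are compared directly against the percentage limits, as the paragraph above does.

```python
# AHA limits quoted in the text, expressed as percentages.
AHA_LIMITS = {"total_fat": 30.0, "saturated": 10.0, "polyunsaturated": 10.0}


def within_aha_limits(profile):
    """Return, for each nutrient, whether the quoted value is below its limit."""
    return {name: profile[name] < limit for name, limit in AHA_LIMITS.items()}


# Figures quoted for ENSURE WITH FIBER in the paragraph above.
ensure_with_fiber = {"total_fat": 22.0, "saturated": 2.01, "polyunsaturated": 6.7}

result = within_aha_limits(ensure_with_fiber)
print(result)
```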
Carbohydrate 

ENSURE WITH FIBER contains a combination of maltodextrin and
sucrose. The mild sweetness and flavor variety (vanilla, chocolate, and butter
pecan), plus VARI-FLAVORS® Flavor Pacs in pecan, cherry, strawberry,
lemon, and orange, help to prevent flavor fatigue and aid in patient compliance.

Vanilla and other nonchocolate flavors 

Maltodextrin ^6% 
Sucrose

Oat Fiber 7% 
Soy Fiber 2% 
Chocolate 

Maltodextrin 55% 
Sucrose 36%

Oat Fiber 7% 
Soy Fiber 2% 

Fiber 

The fiber blend used in ENSURE WITH FIBER consists of oat fiber and
soy polysaccharide. This blend results in approximately 4 grams of total dietary
fiber per 8-fl-oz can. The ratio of insoluble to soluble fiber is 95:5.

The various nutritional supplements described above and known to
others of skill in the art can be substituted and/or supplemented with the PUFAs
of this invention.
J. Oxepa™ Nutritional Product

Oxepa is a low-carbohydrate, calorically dense enteral nutritional product
designed for the dietary management of patients with or at risk for ARDS. It
has a unique combination of ingredients, including a patented oil blend
containing eicosapentaenoic acid (EPA from fish oil), γ-linolenic acid (GLA
from borage oil), and elevated antioxidant levels.

Caloric Distribution:

• Caloric density is high at 1.5 Cal/mL (355 Cal/8 fl oz), to minimize the
volume required to meet energy needs.

• The distribution of Calories in Oxepa is shown in Table 7.



Table 7. Caloric Distribution of Oxepa

                    per 8 fl oz    per liter    % of Cal
Calories            355            1,500
Fat (g)             22.2           93.7         55.2
Carbohydrate (g)    25             105.5        28.1
Protein (g)         14.8           62.5         16.7
Water (g)           186            785
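
As a quick consistency check on Table 7, the stated caloric density can be recomputed from the per-serving Calories, and the percent-of-Calories column can be summed. This is a minimal sketch; the conversion constant (US fluid ounce to millilitres) and the function name are assumptions introduced here, not part of the product description.

```python
ML_PER_US_FL_OZ = 29.5735  # assumed conversion: one US fluid ounce in millilitres


def caloric_density(cal_per_serving, serving_fl_oz=8.0):
    """Calories per millilitre for a serving given in US fluid ounces."""
    return cal_per_serving / (serving_fl_oz * ML_PER_US_FL_OZ)


# Values taken from Table 7.
density = caloric_density(355)
percent_of_cal = {"fat": 55.2, "carbohydrate": 28.1, "protein": 16.7}
total_percent = sum(percent_of_cal.values())

print(round(density, 2), round(total_percent, 1))
```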





Fat:

• Oxepa contains 22.2 g of fat per 8-fl-oz serving (93.7 g/L).

• The fat source is an oil blend of 31.8% canola oil, 25% medium-chain
triglycerides (MCTs), 20% borage oil, 20% fish oil, and 3.2% soy lecithin. The
typical fatty acid profile of Oxepa is shown in Table 8.

• Oxepa provides a balanced amount of polyunsaturated, monounsaturated,
and saturated fatty acids, as shown in Table 10.

• Medium-chain triglycerides (MCTs), 25% of the fat blend, aid gastric
emptying because they are absorbed by the intestinal tract without
emulsification by bile acids.

The various fatty acid components of Oxepa™ nutritional product can
be substituted and/or supplemented with the PUFAs of this invention.
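
The oil blend percentages can be turned into approximate grams of each oil per serving. The sketch below assumes the percentages are by weight of the 22.2 g of fat per 8-fl-oz serving; that basis and the names used are illustrative assumptions, not stated in the description.

```python
# Oil blend percentages quoted above (assumed to be percent by weight of total fat).
OIL_BLEND_PERCENT = {
    "canola oil": 31.8,
    "medium-chain triglycerides": 25.0,
    "borage oil": 20.0,
    "fish oil": 20.0,
    "soy lecithin": 3.2,
}
TOTAL_FAT_G_PER_SERVING = 22.2  # g of fat per 8-fl-oz serving


def grams_per_serving(blend_percent, total_fat_g):
    """Split the total fat of one serving across the oils of the blend."""
    return {oil: total_fat_g * pct / 100.0 for oil, pct in blend_percent.items()}


blend_total = sum(OIL_BLEND_PERCENT.values())
grams = grams_per_serving(OIL_BLEND_PERCENT, TOTAL_FAT_G_PER_SERVING)

for oil, g in grams.items():
    print(f"{oil}: {g:.2f} g")
```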



Table 8. Typical Fatty Acid Profile

                                  % Total Fatty Acids    g/8 fl oz*    g/L*
Caproic (6:0)                     0.2                    0.04          0.18
Caprylic (8:0)                    14.69                  3.1           13.07
Capric (10:0)                     11.06                  2.33          9.87
Palmitic (16:0)                   5.59                   1.18          4.98
Palmitoleic (16:1n-7)             1.82                   0.38          1.62
Stearic (18:0)                    1.84                   0.39          1.64
Oleic (18:1n-9)                   24.44                  5.16          21.75
Linoleic (18:2n-6)                                       3.44          14.49
α-Linolenic (18:3n-3)             3.47                   0.73          3.09
γ-Linolenic (18:3n-6)             4.82                   1.02
Eicosapentaenoic (20:5n-3)        5.11                   1.08          4.55
n-3 Docosapentaenoic (22:5n-3)    0.55                   0.12          0.49
Docosahexaenoic (22:6n-3)         2.27                   0.48          2.02
Others                            7.55                   1.52          6.72

* Fatty acids equal approximately 95% of total fat.
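
The gram columns of Table 8 appear to follow from the percentage column, the footnoted 95% fatty acid fraction, and the total fat content (22.2 g per 8 fl oz, 93.7 g/L). The sketch below checks this for oleic acid; the relationship is an inference from the tabulated numbers rather than something the description states, and the function name is hypothetical.

```python
FATTY_ACID_FRACTION = 0.95       # footnote: fatty acids are about 95% of total fat
TOTAL_FAT_G_PER_SERVING = 22.2   # g per 8 fl oz
TOTAL_FAT_G_PER_LITER = 93.7     # g per liter


def fatty_acid_grams(percent_of_fatty_acids, total_fat_g):
    """Grams of one fatty acid, given its share of total fatty acids."""
    return percent_of_fatty_acids / 100.0 * FATTY_ACID_FRACTION * total_fat_g


# Oleic acid is listed at 24.44% of total fatty acids.
oleic_per_serving = fatty_acid_grams(24.44, TOTAL_FAT_G_PER_SERVING)
oleic_per_liter = fatty_acid_grams(24.44, TOTAL_FAT_G_PER_LITER)

print(round(oleic_per_serving, 2), round(oleic_per_liter, 2))
```

Under these assumptions the computed values land within rounding distance of the tabulated 5.16 g/8 fl oz and 21.75 g/L.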


Table 9. Fat Profile of Oxepa

% of total calories from fat    55.2
Polyunsaturated fatty acids     31.44 g/L
Monounsaturated fatty acids     25.53 g/L
Saturated fatty acids           32.38 g/L
n-6 to n-3 ratio                1.75:1
Cholesterol                     9.49 mg/8 fl oz
                                40.1 mg/L



Carbohydrate: 

• The carbohydrate content is 25.0 g per 8-fl-oz serving (105.5 g/L). 

• The carbohydrate sources are 45% maltodextrin (a complex carbohydrate)
and 55% sucrose (a simple sugar), both of which are readily digested and
absorbed.

• The high-fat and low-carbohydrate content of Oxepa is designed to
minimize carbon dioxide (CO2) production. High CO2 levels can complicate
weaning in ventilator-dependent patients. The low level of carbohydrate also
may be useful for those patients who have developed stress-induced
hyperglycemia.

• Oxepa is lactose-free.

Dietary carbohydrate, the amino acids from protein, and the glycerol
moiety of fats can be converted to glucose within the body. Throughout this
process, the carbohydrate requirements of glucose-dependent tissues (such as
the central nervous system and red blood cells) are met. However, a diet free of
carbohydrates can lead to ketosis, excessive catabolism of tissue protein, and
loss of fluid and electrolytes. These effects can be prevented by daily ingestion
of 50 to 100 g of digestible carbohydrate, if caloric intake is adequate. The
carbohydrate level in Oxepa is also sufficient to minimize gluconeogenesis, if
energy needs are being met.

Protein: 

• Oxepa contains 14.8 g of protein per 8-fl-oz serving (62.5 g/L). 

• The total calorie/nitrogen ratio (150:1) meets the need of stressed patients. 

• Oxepa provides enough protein to promote anabolism and the maintenance 
of lean body mass without precipitating respiratory problems. High protein 
intakes are a concern in patients with respiratory insufficiency. Although 
protein has little effect on CO2 production, a high protein diet will increase 
ventilatory drive. 

• The protein sources of Oxepa are 86.8% sodium caseinate and 13.2%
calcium caseinate.
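
The 150:1 total calorie/nitrogen ratio can be reproduced from the per-liter figures. The sketch below assumes the conventional conversion of 6.25 g of protein per gram of nitrogen, which the description does not state explicitly; the function name is illustrative.

```python
PROTEIN_G_PER_NITROGEN_G = 6.25  # assumed conventional protein-to-nitrogen factor


def calorie_to_nitrogen_ratio(total_calories, protein_g):
    """Total Calories per gram of nitrogen supplied by the protein."""
    nitrogen_g = protein_g / PROTEIN_G_PER_NITROGEN_G
    return total_calories / nitrogen_g


# Per-liter values for Oxepa: 1,500 Cal and 62.5 g of protein.
ratio = calorie_to_nitrogen_ratio(1500, 62.5)
print(f"{ratio:.0f}:1")
```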



All publications and patent applications mentioned in this specification 
are indicative of the level of skill of those skilled in the art to which this 
invention pertains. All publications and patent applications are herein 
incorporated by reference to the same extent as if each individual publication or 
patent application was specifically and individually indicated as incorporated by 
reference. 

The invention now being fully described, it will be apparent to one of
ordinary skill in the art that many changes and modifications can be made
thereto without departing from the spirit or scope of the appended claims.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: KNUTZON, DEBORAH
MUKERJI, PRADIP
HUANG, YUNG-SHENG
THURMOND, JENNIFER
CHAUDHARY, SUNITA
LEONARD, AMANDA

(ii) TITLE OF INVENTION: METHODS AND COMPOSITIONS FOR
SYNTHESIS OF LONG CHAIN POLY-UNSATURATED FATTY ACIDS

(iii) NUMBER OF SEQUENCES: 34

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: LIMBACH & LIMBACH LLP
(B) STREET: 2001 FERRY BUILDING
(C) CITY: SAN FRANCISCO
(D) STATE: CALIFORNIA
(E) COUNTRY: USA
(F) ZIP: 94111

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: MICHAEL R. WARD
(B) REGISTRATION NUMBER: 38,651
(C) REFERENCE/DOCKET NUMBER: CGAB-110

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (415) 433-4150
(B) TELEFAX: (415) 433-8716
(C) TELEX: N/A

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1483 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GCTTCCTCCA GTTCATCCTC CATTTCGCCA CCTGCATTCT TTACGACCGT TAAGCAAGAT 60

GGGAACGGAC CAAGGAAAAA CCTTCACCTG GGAAGAGCTG GCGGCCCATA ACACCAAGGA 120 

CGACCTACTC TTGGCCATCC GCGGCAGGGT GTACGATGTC ACAAAGTTCT TGAGCCGCCA 180 

TCCTGGTGGA GTGGACACTC TCCTGCTCGG AGCTXMCCGA GATOTTACTC CGGTCTTTGA 240 

GATGTATCAC QCX3TTTGGGG CTGCAGATGC CATTATGAAG AAGTACTATG TCX5GTACACT 300 

GGTCTCX5AAT GAGCTGCCCA TCTTCCCGGA GCCAACGGTG TTCCACAAAA CCATCAAGAC 360 

GAOAGTCGAG GOCTACTTTA CGGATCGGAA CATTGATCCC AAGAATAGAC CAGAGATCTG 420 

GGGACGATAC GCTCTTATCT TTGGATCCTT GATCGCTTCC TACTACGCGC AGCTCTTTGT 480 

GCCTTTCGTT GTCGAACGCA CATGGCTTCA GGTGGTGTTT GCAATCATCA TGGGATTTGC 540

GTGCGCACAA GTCGGACTCA ACCCTCTTCA TGATGCGTCT CACTTTTCAG TOACCCACAA 600 

CCCCACTGTC TGGAAGATTC TGGGAGCCAC GCACGACTTT TTCAACGGAG CATCGTACCT 660 

GGTGTGGATG TACCAACATA TGCTCX5GCCA TCACCCCTAC ACCAACATTG CTGGAGCAGA 720 

TCCCGACGTG TCGACGTCTG AGCCCX5ATGT TCGTCGTATC AAGCCCAACC AAAAGTGGTT 780 
TGTCAACCAC ATCAACCAGC ACATGTTTGT TCCTTTCCTG TACGGACTGC TGGCGTTCAA 840
GGTGCGCATT CAGGACATCA ACATTTTGTA CTTTGTCAAG ACCAAT6ACG CTATTCGTGT 900 
CAATCCCATC TCGACATGGC ACACTGTGAT GTTCTGGGGC GGCAAGGCTT TCTTTGTCTG 960 

GTATCGCCTG ATTGTTCCCC TGCAGTATCT GCCCCTGGGC AAGGTGCTGC TCTTGTTCAC 1020 

GGTCGCGGAC ATGGTGTCGT CTTACTGGCT GGCGCTGACC TTCCAGGCGA ACCACX5TTGT 1080 

TGAGGAAGTT CAGTGGCCGT TGCCTGACGA GAACGGGATC ATCCAAAAGG ACTGGGCAGC 1140

TATGCAGGTC GAGACTACGC AGGATTACGC ACACGATTCG CACCTCTGGA CCAGCATCAC 1200 

TGGCAGCTTG AACTACCAGG CTGTGCACCA TCTGTTCCCC AACGTGTCX^C AGCACCATTA 1260 

TCCCGATATT CTGGCCATCA TCAAGAACAC CTGCAGCGAG TACAAGGTTC CATACCTTGT 1320 

CAAGGATACG TTTTGGCAAG CATTTGCTTC ACATTTGGAG CACTTGCGTG TTCTTGGACT 1380 

CCGTCCCAAG GAAGAGTAGA AGAAAAAAAG CGCCGAATGA AGTATTGCCC CCTTTTTCTC 1440

CAAGAATGGC AAAAGGAGAT CAAGTGGACA TTCTCTATGA AGA 1483

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 446 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Gly Thr Asp Gln Gly Lys Thr Phe Thr Trp Glu Glu Leu Ala Ala
1 5 10 15

His Asn Thr Lys Asp Asp Leu Leu Leu Ala Ile Arg Gly Arg Val Tyr
20 25 30

Asp Val Thr Lys Phe Leu Ser Arg His Pro Gly Gly Val Asp Thr Leu
35 40 45

Leu Leu Gly Ala Gly Arg Asp Val Thr Pro Val Phe Glu Met Tyr His
50 55 60

Ala Phe Gly Ala Ala Asp Ala Ile Met Lys Lys Tyr Tyr Val Gly Thr
65 70 75 80

Leu Val Ser Asn Glu Leu Pro Ile Phe Pro Glu Pro Thr Val Phe His
85 90 95

Lys Thr Ile Lys Thr Arg Val Glu Gly Tyr Phe Thr Asp Arg Asn Ile
100 105 110

Asp Pro Lys Asn Arg Pro Glu Ile Trp Gly Arg Tyr Ala Leu Ile Phe
115 120 125

Gly Ser Leu Ile Ala Ser Tyr Tyr Ala Gln Leu Phe Val Pro Phe Val
130 135 140

Val Glu Arg Thr Trp Leu Gln Val Val Phe Ala Ile Ile Met Gly Phe
145 150 155 160

Ala Cys Ala Gln Val Gly Leu Asn Pro Leu His Asp Ala Ser His Phe
165 170 175

Ser Val Thr His Asn Pro Thr Val Trp Lys Ile Leu Gly Ala Thr His
180 185 190

Asp Phe Phe Asn Gly Ala Ser Tyr Leu Val Trp Met Tyr Gln His Met
195 200 205

Leu Gly His His Pro Tyr Thr Asn Ile Ala Gly Ala Asp Pro Asp Val
210 215 220

Ser Thr Ser Glu Pro Asp Val Arg Arg Ile Lys Pro Asn Gln Lys Trp
225 230 235 240

Phe Val Asn His Ile Asn Gln His Met Phe Val Pro Phe Leu Tyr Gly
245 250 255

Leu Leu Ala Phe Lys Val Arg Ile Gln Asp Ile Asn Ile Leu Tyr Phe
260 265 270

Val Lys Thr Asn Asp Ala Ile Arg Val Asn Pro Ile Ser Thr Trp His
275 280 285

Thr Val Met Phe Trp Gly Gly Lys Ala Phe Phe Val Trp Tyr Arg Leu
290 295 300

Ile Val Pro Leu Gln Tyr Leu Pro Leu Gly Lys Val Leu Leu Leu Phe
305 310 315 320

Thr Val Ala Asp Met Val Ser Ser Tyr Trp Leu Ala Leu Thr Phe Gln
325 330 335

Ala Asn His Val Val Glu Glu Val Gln Trp Pro Leu Pro Asp Glu Asn
340 345 350

Gly Ile Ile Gln Lys Asp Trp Ala Ala Met Gln Val Glu Thr Thr Gln
355 360 365

Asp Tyr Ala His Asp Ser His Leu Trp Thr Ser Ile Thr Gly Ser Leu
370 375 380

Asn Tyr Gln Ala Val His His Leu Phe Pro Asn Val Ser Gln His His
385 390 395 400

Tyr Pro Asp Ile Leu Ala Ile Ile Lys Asn Thr Cys Ser Glu Tyr Lys
405 410 415

Val Pro Tyr Leu Val Lys Asp Thr Phe Trp Gln Ala Phe Ala Ser His
420 425 430

Leu Glu His Leu Arg Val Leu Gly Leu Arg Pro Lys Glu Glu
435 440 445

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 186 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

Leu His His Thr Tyr Thr Asn Ile Ala Gly Ala Asp Pro Asp Val Ser
1 5 10 15

Thr Ser Glu Pro Asp Val Arg Arg Ile Lys Pro Asn Gln Lys Trp Phe
20 25 30

Val Asn His Ile Asn Gln His Met Phe Val Pro Phe Leu Tyr Gly Leu
35 40 45

Leu Ala Phe Lys Val Arg Ile Gln Asp Ile Asn Ile Leu Tyr Phe Val
50 55 60

Lys Thr Asn Asp Ala Ile Arg Val Asn Pro Ile Ser Thr Trp His Thr
65 70 75 80

Val Met Phe Trp Gly Gly Lys Ala Phe Phe Val Trp Tyr Arg Leu Ile
85 90 95

Val Pro Leu Gln Tyr Leu Pro Leu Gly Lys Val Leu Leu Leu Phe Thr
100 105 110

Val Ala Asp Met Val Ser Ser Tyr Trp Leu Ala Leu Thr Phe Gln Ala
115 120 125

Asn Tyr Val Val Glu Glu Val Gln Trp Pro Leu Pro Asp Glu Asn Gly
130 135 140

Ile Ile Gln Lys Asp Trp Ala Ala Met Gln Val Glu Thr Thr Gln Asp
145 150 155 160

Tyr Ala His Asp Ser His Leu Trp Thr Ser Ile Thr Gly Ser Leu Asn
165 170 175

Tyr Gln Xaa Val His His Leu Phe Pro His
180 185

(2) INFORMATION FOR SEQ ID NO: 4: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 57 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 
15 (D) TOPOLOGY: linear 

(ii) r«>LECULE TYPE: peptide 

20 (xi) SEQUENCE DESCRIPTION: SEQ ID N0:4: 

Met Ala Ala Ala Pro Ser Val Arg Thr Phe Thr Arg Ala Glu Val Leu 
1 5 10 15 

Asn Ala Glu Ala Leu Asn Glu Gly Lys Lys Asp Ala Glu Ala Pro Phe 
20 25 30 

Leu Met He He Asp Asn Lys Val Tyr Asp Val Arg Glu Phe Val Pro 
35 40 45 

Asp His Pro Gly Gly Ser Val He Leu Thr His Val Gly Lys Asp Gly 
50 55 60 

Thr ASP Val Phe Asp Thr Phe His Pro Glu Ala Ala Trp Glu Thr Leu 
35 65 70 75 80 

Ala Asn Phe Tyr Val Gly Asp He Asp Glu Ser Asp Arg Asp He Lys 
85 50 95 

40 Asn Asp Asp Phe Ala Ala Glu Val Arg Lys Leu Arg Thr Leu Phe Gin 

100 105 110 



Ser Leu Gly Tyr Tyr Asp Ser Ser Lys Ala Tyr Tyr Ala Phe Lys Val 
115 120 125 

Ser Phe Asn Leu Cys He Trp Gly Leu Ser Thr Val He Val Ala Lys 
130 135 140 



Trp Gly Gin Thr Ser 



Thr Leu Ala Asn Val Leu Ser Ala Ala Leu Leu 



50 145 150 155 160 

Gly Leu Phe Trp Gin Gin Cys Gly Trp Leu Ala His Asp Phe Leu His 
165 170 175 

55 His Gin Val Phe Gin Asp Arg Phe Trp Gly Asp Leu Phe Gly Ala Phe 

180 185 190 



Leu Gly Gly Val Cys Gin Gly Phe Ser Ser Ser Trp Trp Lys Asp Lys 
195 200 205 

His Asn Thr His His Ala Ala Pro Asn Val His Val Glu Asp Pro Asp 
210 215 220 
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lie ASP Thr Hie Pro Leu Leu Thr Trp Ser Glu His Ala Leu Glu Met 
225 230 235 240 

Phe ser Asp Val Pro Asp Glu Glu Leu Thr Arg Met Trp Ser Arg Phe 
245 250 255 

Met val Leu Asn Gin Thr Trp Phe Tyr Phe Pro He Leu Ser Phe Ala 
260 265 270 

Arg Leu Ser Trp Cys Leu Gin Ser He Leu Phe Val Leu Pro Asn Gly 
275 280 285 

Gin Ala His Lys Pro Ser Gly Ala Arg Val Pro He Ser Leu Val Glu 
15 290 295 300 

Gin Leu Ser Leu Ala Met His Trp Thr Trp Tyr Leu Ala Thr Met Phe 
305 



10 



25 



35 



40 



55 



60 



310 ' 315 320 



20 Leu Phe He Lys Aep Pro Val Asn Met Leu Val Tyr Phe Leu Val Ser 

325 330 335 



Gin Ala Val Cys Gly Asn Leu Leu Ala He Val Phe Ser Leu Asn His 
340 345 350 

Asn Gly Met Pro Val He Ser Lys Glu Glu Ala Val Asp Met Asp Phe 
355 360 365 

Phe Thr Lys Gin He He Thr Gly Arg Asp Val His Pro Gly Leu Phe 
30 370 375 380 

Ala Asn Trp Phe Thr Gly Gly Leu Asn Tyr Gin He Glu His His Leu 
3S5 390 395 



Phe Pro Ser Met Pro Arg His Asn Phe Ser Lys He Gin Pro Ala Val 
405 410 4" 

Glu Thr Leu Cys Lys Lys Tyr Asn Val Arg Tyr His Thr Thr Gly Met 
420 425 430 

He Glu Gly Thr Ala Glu Val Phe Ser Arg Leu Asn Glu Val Ser Lys 
435 440 445 



45 Ala Ala Ser Lys Met Gly Lys Ala Gin 

450 455 

(2) INFORMATION FOR SEQ ID NO: 5: 

50 (i) SEQUENCE CHARACTERISTICS; 

(A) LENGTH: 446 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 



Met Ala Ala Gin He Lys Lys Tyr He Thr Ser Asp Glu Leu Lys Asn 



10 



15 
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His Asp Lys Pro Gly Asp Leu Trp lie Ser He Gin Gly Lys Ala Tyr 
20 25 30 

ASP Val Ser Asp Trp Val Lys Asp His Pro Gly Gly Ser Phe Pro Leu 
35 40 45 

Lys ser Leu Ala Gly Gin Glu Val Thr Asp Ala Phe Val Ala Phe His 
50 55 60 

Pro Ala Ser Thr Trp Lys Asn Leu Asp Lys Phe Phe Thr Gly Tyr Tyr 
65 70 75 ao 

Leu Lys Asp Tyr Ser Val Ser Glu Val Ser Lys Val Tyr Arg Lys Leu 
15 85 90 95 

Val Phe Glu Phe Ser Lys Met Gly Leu Tyr Asp Lys Lys Gly His He 
100 105 110 

20 Met Phe Ala Thr Leu Cys Phe He Ala Met Leu Phe Ala Met Ser Val 

115 120 125 

Tyr Gly Val Leu Phe Cys Glu Gly Val Leu Val His Leu Phe Ser Gly 
130 135 140 

Cys Leu Met Gly Phe Leu Trp He Gin Ser Gly Trp He Gly His Asp 
lis 150 155 160 

Ala Gly His Tyr Met Val Val Ser Asp Ser Arg Leu Asn Lys Phe Met 
30 165 170 175 

Gly He Phe Ala Ala Asn Cys Leu Ser Gly He Ser He Gly Trp Trp 
180 185 190 

35 Lys Trp Asn His Asn Ala His His He Ala Cys Asn Ser Leu Glu Tyr 

^ 195 200 205 

ASP Pro ASP Leu Gin Tyr He Pro Phe Leu Val Val Ser Ser Lys Phe 
210 215 220 

Phe Gly ser Leu Thr Ser His Phe Tyr Glu Lys Arg Leu Thr Phe Asp 
225 230 235 240 



25 



40 



Ser Leu Ser Arg Phe Phe Val Ser Tyr Gin His Trp Thr Phe Tyr Pro 
45 245 250 255 

He Met Cys Ala Ala Arg Leu Asn Met Tyr Val Gin Ser Leu He Met 
260 265 270 

50 Leu Leu Thr Lys Arg Asn Val Ser Tyr Arg Ala Gin Glu Leu Leu Gly 

275 280 285 



55 



cys Leu Val Phe Ser He Trp Tyr Pro Leu Leu Val Ser Cys Leu Pro 
290 295 300 

Asn Trp Gly Glu Arg He Met Phe Val He Ala Ser Leu Ser Val Thr 
305 310 315 320 



Gly Met Gin Gin Val Gin Phe Ser Leu Asn His Phe Ser Ser Ser Val 
60 325 330 335 

Tyr val Gly Lys Pro Lys Gly Asn Asn Trp Phe Glu Lys Gin Thr Asp 
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340 



345 350 



Gly Thr Leu Asp lie Ser Cys Pro Pro Trp Met Asp Trp Phe Hie Gly 
355 360 

^ Gly Leu Gin Phe Gin lie Glu Hie Hie Leu Phe Pro Lys Met Pro Arg 

370 375 380 

Cya Asn Leu Arg Lys He Ser Pro Tyr Val He Glu Leu Cys Lys Lys 
10 385 390 395 400 

His Asn Leu Pro Tyr Asn Tyr Ala Ser Phe Ser Lys Ala Asn Glu Met 
405 410 415 

15 Thr Leu Arg Thr Leu Arg Asn Thr Ala Leu Gin Ala Arg Asp He Thr 

420 425 430 



20 



30 



45 



60 



Lys Pro Leu Pro Lys Asn Leu Val Trp Glu Ala Leu His Thr 
435 440 445 

(2) INFORMATION FOR SEQ ID NO; 6: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 359 amino acids 
25 (B) TYPE: amino acid 

(C) STRAMDEDNESS : not relevant 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 



Met Leu Thr Ala Glu Arg He Lys Phe Thr Gin Lys Arg Gly Phe Arg 
35 1 5 10 15 

Arg Val Leu Asn Gin Arg Val Asp Ala Tyr Phe Ala Glu His Gly Leu 
20 25 30 

40 Thr Gin Arg Asp Asn Pro Ser Met Tyr Leu Lys Thr Leu He He Val 

35 40 45 



Leu Trp Leu Phe Ser Ala Trp Ala Phe Val Leu Phe Ala Pro Val He 
50 55 €0 

Phe Pro Val Arg Leu Leu Gly Cys Met Val Leu Ala He Ala Leu Ala 
65 70 75 80 



Ala Phe Ser Phe Asn Val Gly His Asp Ala Asn His Asn Ala Tyr Ser 
50 85 90 9S 

Ser Asn Pro His He Asn Arg Val Leu Gly Met Thr Tyr Asp Phe Val 
100 105 110 

55 Gly Leu Ser Ser Phe Leu Trp Arg Tyr Arg His Asn Tyr Leu His His 

115 120 125 



Thr Tyr Thr Asn He Leu Gly His Asp Val Glu He His Gly Asp Gly 
130 135 140 

Ala Val Arg Met Ser Pro Glu Gin Glu His Val Gly He Tyr Arg Phe 
145 150 155 160 
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10 



25 



40 



50 



Gin Gin Phe Tyr lie Trp Gly Leu Tyr Leu Phe He Pro Phe Tyr Trp 
165 170 175 

Phe Leu Tyr Asp Val Tyr Leu Val Leu Asn Lye Gly Lys Tyr His Asp 
180 185 ISO 

His Lya He Pro Pro Phe Gin Pro Leu Glu Leu Ala Ser Leu Leu Gly 
195 200 205 

He Lys Leu Leu Trp Leu Gly Tyr Val Phe Gly Leu Pro Leu Ala Leu 
210 215 220 



Gly Phe ser He Pro Glu Val Leu He Gly Ala Ser Val Thr Tyr Met 
15 225 230 235 240 

Thr Tyr Gly He Val Val Cys Thr He Phe Met Leu Ala His Val Leu 
245 250 255 

20 Glu Ser Thr Glu Phe Leu Thr Pro Asp Gly Glu Ser Gly Ala He Asp 

260 265 270 



Asp Glu Trp Ala He Cys Gin He Arg Thr Thr Ala Asn Phe Ala Thr 
275 280 285 

Asn Asn Pro Phe Trp Asn Trp Phe Cys Gly Gly Leu Asn His Gin Val 
290 295 300 



Thr His His Leu Phe Pro. Asn He Cys His He His Tyr Pro Gin Leu 
30 305 310 315 320 

Glu Asn He He Lys Asp Val Cys Gin Glu Phe Gly Val Glu Tyr Lys 
325 330 335 

35 Val Tyr Pro Thr Phe Lys Ala Ala He Ala Ser Asn Tyr Arg Trp Leu 

340 345 350 



Glu Ala Met Gly Lys Ala Ser 

355 

(2) INFORMATION FOR SEQ ID NO : 7 : 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 365 amino acids 
45 (B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 



Met Thr Ser Thr Thr Ser Lys Val Thr Phe Gly Lys Ser He Gly Phe 
55 1 5 10 15 

Arg Lys Glu I^u Asn Arg Arg Val Asn Ala Tyr Leu Glu Ala Glu Asn 
20 25 30 

60 He Ser Pro Arg Asp Asn Pro Pro Met Tyr Leu Lys Thr Ala He He 

35 40 45 
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Leu Ala Trp Val Val Ser Ala Trp Thr Phe Val Val Phe Gly Pro Asp 
50 55 60 

val Leu Trp Met Lys Leu Leu Gly Cys lie Val Leu Gly Phe Gly Val 
5 65 70 75 80 

Ser Ala Val Gly Phe Asn lie Ser His Asp Gly Asn His Gly Gly Tyr 
B5 90 95 

10 Ser Lys Tyr Gin Trp Val Asn Tyr Leu Ser Gly Leu Thr His Asp Ala 

100 105 110 



15 



30 



45 



60 



He Gly Val Ser Ser Tyr Leu Trp Lys Phe Arg His Asn Val Leu His 
115 120 125 

His Thr Tyr Thr Asn He Leu Gly His Asp Val Glu He His Gly Asp 
130 135 140 



Glu Leu Val Arg Met Ser Pro Ser Met Glu Tyr Arg Trp Tyr His Arg 
20 145 150 155 160 

Tyr Gin His Trp Phe He Trp Phe Val Tyr Pro Phe He Pro Tyr Tyr 
165 170 175 

25 Trp ser He Ala Asp Val Gin Thr Met Leu Phe Lys Arg Gin Tyr His 

180 185 190 



Asp His Glu He Pro Ser Pro Thr Trp Val Asp He Ala Thr Leu Leu 
195 200 205 

Ala Phe Lys Ala Phe Gly Val Ala Val Phe Leu He He Pro He Ala 
210 215 220 



Val Gly Tyr Ser Pro Leu Glu Ala Val He Gly Ala Ser He Val Tyr 
35 225 230 235 240 

Met Thr His Gly Leu Val Ala Cys Val Val Phe Met Leu Ala His Val 
245 250 255 

40 He Glu Pro Ala Glu Phe Leu Asp Pro Asp Asn Leu His He Asp Asp 

260 265 270 



Glu Trp Ala He Ala Gin Val Lys Thr Thr Val Asp Phe Ala Pro Asn 
275 280 285 

Asn Thr He He Asn Trp Tyr Val Gly Gly Leu Asn Tyr Gin Thr Val 
290 295 300 



His His Leu Phe Pro His He Cys His He His Tyr Pro Lys He Ala 
50 305 310 315 320 

Pro He Leu Ala Glu Val Cys Glu Glu Phe Gly Val Asn Tyr Ala Val 
325 330 335 

55 His Gin Thr Phe Phe Gly Ala Leu Ala Ala Asn Tyr Ser Trp Leu Lys 

340 345 350 



Lys Met Ser He Asn Pro Glu Thr Lys Ala He Glu Gin 
355 360 365 

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Synthetic oligonucleotide"

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 21
(D) OTHER INFORMATION: /number= 1
/note= "N=Inosine or Cytosine"

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 27
(D) OTHER INFORMATION: /number= 2
/note= "N=Inosine or Cytosine"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
CUACUACUAC UACAYCAYAC NTAYACNAAY AT 32



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Synthetic oligonucleotide"

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 13
(D) OTHER INFORMATION: /number=
/note= "N=Inosine or Cytosine"

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 19
(D) OTHER INFORMATION: /number=
/note= "N=Inosine or Cytosine"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
CAUCAUCAUC AUNGGRAANA RRTGRTG 27

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
CCAAGCTTCT GCAGGAGCTC TTTTTTTTTT TTTTT 35

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

His Xaa Xaa His His
1 5

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Gln Xaa Xaa His His
1 5



(2) INFORMATION FOR SEQ ID NO: 13: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 746 nucleic acids 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

CGTATGTCAC TCCATTCCAA ACTCGTTCAT GGTATCATAA ATATCAACAC ATTTACGCTC 60
CACTCCTCTA TGGTATTTAC ACACTCAAAT ATCGTACTCA AGATTGGGAA GCTTTTGTAA 120
AGGATGGTAA AAATGGTGCA ATTCGTGTTA GTGTCGCCAC AAATTTCGAT AAGGCOGCTT 180
ACGTCATTGG TAAATTGTCT TTTGTTTTCT TCCGTTTCAT CCTTCCACTC CGTTATCATA 240
GCTTTACAGA TTTAATTTGT TATTTCCTCA TTGCTGAATT CGTCTTTGGT TGGTATCTCA 300
CAATTAATTT CCAAGTTAGT CATGTCGCTG AAGATCTCAA ATTCTTTGCT ACCCCTGAAA 360
GACCAGATGA ACCATCTCAA ATCAATGAAG ATTGGGCAAT CCTTCAACTT AAAACTACTC 420






WO 98/46765



PCT/US98/07422 



AAGATTATGG TCATGGTTCA CTCCTTTGTA CCTTTTTTAG TGGTTCTTTA AATCATCAAG 480
TTGTTCATCA TTTATTCCCA TCAATTGCTC AAGATTTCTA CCCACAACTT GTACCAATTG 540
TAAAAGAAGT TTGTAAAGAA CATAACATTA CTTACCACAT TAAACCAAAC TTCACTGAAG 600
CTATTATGTC ACACATTAAT TACCTTTACA AAATGGGTAA TGATCC3W3AT TATGTTAAAA 660
T^CCATTAGC CTCAAAAGAT GATTAAATGA AATAACTTAA AAACCAATTA TTTACTTTTG 720
ACAAACAGTA ATATTAATAA ATACAA 746

(2) INFORMATION FOR SEQ ID NO: 14: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 227 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Tyr Val Thr Pro Phe Gln Thr Arg Ser Trp Tyr His Lys Tyr Gln
1 5 10 15
His Ile Tyr Ala Pro Leu Leu Tyr Gly Ile Tyr Thr Leu Lys Tyr
20 25 30
Arg Thr Gln Asp Trp Glu Ala Phe Val Lys Asp Gly Lys Asn Gly
35 40 45
Ala Ile Arg Val Ser Val Ala Thr Asn Phe Asp Lys Ala Ala Tyr
50 55 60
Val Ile Gly Lys Leu Ser Phe Val Phe Phe Arg Phe Ile Leu Pro
65 70 75
Leu Arg Tyr His Ser Phe Thr Asp Leu Ile Cys Tyr Phe Leu Ile
80 85 90
Ala Glu Phe Val Phe Gly Trp Tyr Leu Thr Ile Asn Phe Gln Val
95 100 105
Ser His Val Ala Glu Asp Leu Lys Phe Phe Ala Thr Pro Glu Arg
110 115 120
Pro Asp Glu Pro Ser Gln Ile Asn Glu Asp Trp Ala Ile Leu Gln
125 130 135
Leu Lys Thr Thr Gln Asp Tyr Gly His Gly Ser Leu Leu Cys Thr
140 145 150
Phe Phe Ser Gly Ser Leu Asn His Gln Val Val His His Leu Phe
155 160 165
Pro Ser Ile Ala Gln Asp Phe Tyr Pro Gln Leu Val Pro Ile Val
170 175 180
Lys Glu Val Cys Lys Glu His Asn Ile Thr Tyr His Ile Lys Pro
185 190 195
Asn Phe Thr Glu Ala Ile Met Ser His Ile Asn Tyr Leu Tyr Lys
200 205 210
Met Gly Asn Asp Pro Asp Tyr Val Lys Lys Pro Leu Ala Ser Lys
215 220 225
Asp Asp ***

(2) INFORMATION FOR SEQ ID NO: 15: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 494 nucleic acids 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

TTTTGGAAGG NTCCAAGTTN ACCACGGANT NGGCAAGTTN ACGGGGCGGA AANCGGTTTT 60
CCCCCCAAGC CTTTTGTCGA CTGGTTCTGT GGTGGCTTCC AGTACCAAGT CGACCACCAC 120
TTATTCCCCA GCCTGCCCCG ACACAATCTG GCCAAGACAC ACGCACTGGT CGAATCGTTC 180
TGCAAGGAGT GGGGTGTCCA GTACCACGAA GCCGACCTCG TGGACGGGAC CATGGAAGTC 240
TTGCACCATT TGGGCAGCGT GGCCGGCGAA TTCGTCGTGG ATTTTGTACG CGACGGACCC 300
GCCATGTAAT CGTCGTTCGT GACGATGCAA GGGTTCACGC ACATCTACAC ACACTCACTC 360
ACACAACTAG TGTAACTCGT ATAGAATTCG GTGTCGACCT GGACCTTGTT TGACTGGTTG 420
GGGATAGGGT AGGTAGGCGG ACGCGTGGGT CGNCCCCGGG AATTCTGTGA CCGGTACCTG 480
GCCCGCGTNA AAGT 494

(2) INFORMATION FOR SEQ ID NO: 16: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 87 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Phe Trp Lys Xxx Pro Ser Xxx Pro Arg Xxx Xxx Gln Val Xxx Gly
1 5 10 15
Ala Glu Xxx Gly Phe Pro Pro Lys Pro Phe Val Asp Trp Phe Cys
20 25 30
Gly Gly Phe Gln Tyr Gln Val Asp His His Leu Phe Pro Ser Leu
35 40 45
Pro Arg His Asn Leu Ala Lys Thr His Ala Leu Val Glu Ser Phe
50 55 60
Cys Lys Glu Trp Gly Val Gln Tyr His Glu Ala Asp Leu Val Asp
65 70 75
Gly Thr Met Glu Val Leu His His Leu Gly Ser Val Ala Gly Glu
65 70 75
Phe Val Val Asp Phe Val Arg Asp Gly Pro Ala Met
80 85



(2) INFORMATION FOR SEQ ID NO: 17: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 520 nucleic acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

GGATGGAGTT CGTCTGGATC GCTGTGCGCT ACGCGACGTG GTTTAAGCGT CATGGGTGCG 60
CTTGGGTACA CGCCGGGGCA GTCGTTGGGC ATGTACTTGT GCGCCTTTGG TCTCGGCTGC 120
ATTTACATTT TTCTGCAGTT CGCCGTAAGT CACACCCATT TGCCCGTGAG CAACCCGGAG 180
GATCAGCTGC ATTGGCTCGA GTACGCGCGG ACCACACTGT GAACATCAGC ACCAAGTCGT 240
GGTTTGTCAC ATGGTGGATG TCGAACCTCA ACTTTCAGAT CGAGCACCAC CTTTTCCCCA 300
CGGCGCCCCA GTTCCGTTTC AAGGAGATCA GCCCGCGCGT CGAGGCCCTC TTCAAGCGCC 360
ACGGTCTCCC TTACTACGAC ATGCCCTACA CGAGCGCCGT CTCCACCACC TTTGCGAACC 420
TCTACTCCGT CGGCCATTCC GTCGGCGACG CCAAGCGCGA CTAGCCTCTT TTCCTAGACC 480
TTAATTCCCC ACCCCACCCC ATGTTCTGTC TTCCTCCCGC 520

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 153 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Met Glu Phe Val Trp Ile Ala Val Arg Tyr Ala Thr Trp Phe Lys
1 5 10 15
Arg His Gly Cys Ala Trp Val His Ala Gly Ala Val Val Gly His
20 25 30
Val Leu Val Arg Leu Trp Ser Arg Leu His Leu His Phe Ser Ala
35 40 45
Val Arg Arg Lys Ser His Pro Phe Ala Arg Glu Gln Pro Gly Gly
50 55 60
Ser Ala Ala Leu Ala Arg Val Arg Ala Asp His Thr Val Asn Ile
65 70 75
Ser Thr Lys Ser Trp Phe Val Thr Trp Trp Met Ser Asn Leu Asn
80 85 90
Phe Gln Ile Glu His His Leu Phe Pro Thr Ala Pro Gln Phe Arg
95 100 105
Phe Lys Glu Ile Ser Pro Arg Val Glu Ala Leu Phe Lys Arg His
110 115 120
Gly Leu Pro Tyr Tyr Asp Met Pro Tyr Thr Ser Ala Val Ser Thr
125 130 135
Thr Phe Ala Asn Leu Tyr Ser Val Gly His Ser Val Gly Asp Ala
140 145 150
Lys Arg Asp



(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 429 nucleic acids
(B) TYPE: nucleic acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

ACGCGTCGGC CCACGCGTCC GCCGCGAGCA ACTCATCAAG GAAGGCTACT TTGACCCCTC 60
GCTCCCGCAC ATGACGTACC GCGTGGTCGA GATTGTTGTT CTCTTCGTGC TTTCCTTTTG 120
GCTGATGGGT CAGTCTTCAC CCCTCGCGCT CGCTCTCGGC ATTGTOGTCA GCGGCATCTC 180
TCAGGGTCGC TGCGGCTGGG TAATGCATGA GATGGGCCAT GGGTCGTTCA CTGGTGTCAT 240
TTGGCTTGAC GACCGGTTGT GCGAGTTCTT TTACGGCGTT GGTTGTGGCA TGAGCGGTCA 300
TTACTGGAAA AACCAGCACA GCAAACACCA CGCAGCGCCA AACCGGCTCG AGCACGATGT 360
AGATCTCAAC ACCTTGCCAT TGGTGGCCTT CAACGAGCGC GTCGTGCGCA AGGTCCGACC 420



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 125 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Arg Val Arg Pro Arg Val Arg Arg Glu Gln Leu Ile Lys Glu Gly
1 5 10 15
Tyr Phe Asp Pro Ser Leu Pro His Met Thr Tyr Arg Val Val Glu
20 25 30
Ile Val Val Leu Phe Val Leu Ser Phe Trp Leu Met Gly Gln Ser
35 40 45
Ser Pro Leu Ala Leu Ala Leu Gly Ile Val Val Ser Gly Ile Ser
50 55 60
Gln Gly Arg Cys Gly Trp Val Met His Glu Met Gly His Gly Ser
65 70 75
Phe Thr Gly Val Ile Trp Leu Asp Asp Arg Leu Cys Glu Phe Phe
65 70 75
Tyr Gly Val Gly Cys Gly Met Ser Gly His Tyr Trp Lys Asn Gln
80 85 90
His Ser Lys His His Ala Ala Pro Asn Arg Leu Glu His Asp Val
95 100 105
Asp Leu Asn Thr Leu Pro Leu Val Ala Phe Asn Glu Arg Val Val
110 115 120
Arg Lys Val Arg Pro
125

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1219 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid (Edited Contig 2692004) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 



GCACGCCGAC CGGCGCCGGG AGATCCTGGC AAAGTATCCA GAGATAAAGT CCTTGATGAA 60
ACCTGATCCC AATTTGATAT GGATTATAAT TATGATGGTT CTCACCCAGT TGGGTGCATT 120
TTACATAGTA AAAGACTTGG ACTGGAAATG GGTCATATTT GGGGCCTATG CGTTTGGCAG 180
TTGCATTAAC CACTCAATGA CTCTGGCTAT TCATGAGATT GCCCACAATG CTGCCTTTGG 240
CAACTGCAAA GCAATGTGGA ATCGCTGGTT TGGAATGTTT GCTAATCTTC CTATTGGGAT 300
TCCATATTCA ATTTCCTTTA AGAGGTATCA CATGGATCAT CATCGGTACC TTGGAGCTGA 360
TGGCGTCGAT GTAGATATTC CTACCGATTT TGAGGGCTGG TTCTTCTGTA CCGCTTTCAG 420
AAAGTTTATA TGGGTTATTC TTCAGCCTCT CTTTTATGCC TTTCGACCTC TGTTCATCAA 480
CCCCAAACCA ATTACGTATC TGGAAGTTAT CAATACCGTG GCACAGGTCA CTTTTGACAT 540
TTTAATTTAT TACTTTTTGG GAATTAAATC CTTAGTCTAC ATGTTGGCAG CATCTTTACT 600
TGGCCTGGGT TTGCACCCAA TTTCTGGACA TTTTATAGCT GAGCATTACA TGTTCTTAAA 660
GGGTCATGAA ACTTACTCAT ATTATGGGCC TCTGAATTTA CTTACCTTCA ATGTGGGTTA 720
TCATAATGAA CATCATGATT TCCCCAACAT TCCTGGAAAA AGTCTTCCAC TGGTGAGGAA 780
AATAGCAGCT GAATACTATG ACAACCTCCC TCACTACAAT TCCTGGATAA AAGTACTGTA 840
TGATTTTGTG ATGGATGATA CAATAAGTCC CTACTCAAGA ATGAAGAGGC ACCAAAAAGG 900
AGAGATGGTG CTGGAGTAAA TATCATTAGT GCCAAAGGGA TTCTTCTCCA AAACTTTAGA 960
TGATAAAATG GAATTTTTGC ATTATTAAAC TTGAGACCAG TGATGCTCAG AAGCTCCCCT 1020
GGCACAATTT CAGAGTAAGA GCTCGGTGAT ACCAAGAAGT GAATCTGGCT TTTAAACAGT 1080
CAGCCTGACT CTGTACTGCT CAGTTTCACT CACAGGAAAC TTGTGACTTG TGTATTATCG 1140
TCATTGAGGA TGTTTCACTC ATGTCTGTCA TTTTATAAGC ATATCATTTA AAAAGCTTCT 1200
AAAAAGCTAT TTCGCCAGG 1219



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 655 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid (Edited Contig 2153526)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

TTACCTTCTA CGTCCGCTTC TTCCTCACTT ATGTGCCACT ATTGGGGCTG AAAGCTTCCT 60
GGGCCTTTTC TTCATAGTCA GGTTCCTGGA AAGCAACTGG TTTGTGTGGG TGACACAGAT 120
GAACCATATT CCCATGCACA TTGATCATGA CCGGAACATG GACTGGGTTT CCACCCAGCT 180
CCAGGCCACA TGCAATGTCC ACAAGTCTGC CTTCAATGAC TGGTTCAGTG GACACCTCAA 240
CTTCCAGATT GAGCACCATC TTTTTCCCAC GATGCCTCGA CACAATTACC ACAAAGTGGC 300
TCCCCTGGTG CAGTCCTTGT GTGCCAAGCA TGGCATAGAG TACCAGTCCA AGCCCCTGCT 360
GTCAGCCTTC GCCGACATCA TCCACTCACT AAAGGAGTCA GGGCAGCTCT GGCTAGATGC 420
CTATCTTCAC CAATAACAAC AGCCACCCTG CCCAGTCTGG AAGAAGAGGA GGAAGACTCT 480
GGAGCCAAGG CAGAGGGGAG CTTGAGGGAC AATGCCACTA TAGTTTAATA CTCAGAGGGG 540
GTTGGGTTTG GGGACATAAA GCCTCTGACT CAAACTCCTC CCTTTTATCT TCTAGCCACA 600
GTTCTAAGAC CCAAAGTGGG GGGTGGACAC AGAAGTCCCT AGGAGGGAAG GAGCT 655

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 304 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid (Edited Contig 3506132) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

GTCTTTTACT TTGGCAATGG CTGGATTCCT ACCCTCATCA CGGCCTTTGT CCTTGCTACC 60 

TCTCAGGCCC AAGCTGGATG GCTGCAACAT GATTATGGCC ACCTGTCTGT CTACAGAAAA 120 

CCCAAGTGGA ACCACCTTGT CCACAAATTC GTCATTGGCC ACTTAAAGGG TGCCTCTGCC 180 

AACTGGTGGA ATCATCGCCA CTTCCAGCAC CACGCCAAGC CTAACATCTT CCACAAGGAT 240 

CCCGATGTGA ACATGCTGCA CGTGTTTGTT CTGGGCGAAT GGCAGCCCAT CGAGTACGGC 300 
AAGA 
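
The running totals printed at the end of each sequence line (60, 120, ...) give the cumulative residue count, and SEQ ID NO: 30 is described as a translation of this contig. As a minimal sketch, assuming Python and the standard genetic code, the first listing line of SEQ ID NO: 23 can be checked against its counter and translated in reading frame 1. The helper names `parse_listing_line` and `translate` are hypothetical and are not part of the listing.

```python
# Standard genetic code, laid out in T, C, A, G order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter codes; "***" marks a stop, as in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}


def parse_listing_line(line):
    """Split a listing line into (concatenated bases, trailing counter)."""
    *groups, counter = line.split()
    return "".join(groups), int(counter)


def translate(dna):
    """Translate unambiguous DNA in frame 1, dropping any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    codons = (dna[i:i + 3] for i in range(0, usable, 3))
    return " ".join(THREE_LETTER[CODON_TABLE[codon]] for codon in codons)


# First line of SEQ ID NO: 23 as printed in the listing.
first_line = "GTCTTTTACT TTGGCAATGG CTGGATTCCT ACCCTCATCA CGGCCTTTGT CCTTGCTACC 60"
bases, counter = parse_listing_line(first_line)
print(len(bases) == counter)
print(translate(bases[:30]))
```

The translated residues can be compared by eye with the opening residues of SEQ ID NO: 30. The sketch handles only A, C, G and T; ambiguity codes such as N or R would need a separate rule.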



(2) INFORMATION FOR SEQ ID NO: 24:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 918 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid (Edited Contig 3854933)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

CAGGGACCTA CCCCGCGCTA CTTCACCTGG GACGAGGTGG CCCAGCGCTC AGGGTGCGAG 60
GAGCGGTGGC TAGTGATCGA CCGTAAGGTG TACAACATCA GCGAGTTCAC CCGCCGGCAT 120
CCAGGGGGCT CCCGGGTCAT CAGCCACTAC GCCGGGCAGG ATGCCACGGA TCCCTTTGTG 180
GCCTTCCACA TCAACAAGGG CCTTGTGAAG AAGTATATGA ACTCTCTCCT GATTGGAGAA 240
CTGTCTCCAG AGCAGCCCAG CTTTGAGCCC ACCAAGAATA AAGAGCTGAC AGATGAGTTC 300
CGGGAGCTGC GGGCCACAGT GGAGCGGATG GGGCTCATGA AGGCCAACCA TGTCTTCTTC 360
CTGCTGTACC TGCTGCACAT CTTGCTGCTG GATGGTGCAG CCTGGCTCAC CCTTTGGGTC 420
TTTGGGACGT CCTTTTTGCC CTTCCTCCTC TGTGCGGTGC TGCTCAGTGC AGTTCAGGCC 480
CAGGCTGGCT GGCTGCAGCA TGACTTTGGG CACCTGTCGG TCTTCAGCAC CTCAAAGTGG 540
AACCATCTGC TACATCATTT TGTGATTGGC CACCTGAAGG GGGCCCCCGC CAGTTGGTGG 600
AACCACATGC ACTTCCAGCA CCATGCCAAG CCCAACTGCT TCCGCAAAGA CCCAGACATC 660
AACATGCATC CCTTCTTCTT TGCCTTGGGG AAGATCCTCT CTGTGGAGCT TGGGAAACAG 720
AAGAAAAAAT ATATGCCGTA CAACCACCAG CACARATACT TCTTCCTAAT TGGGCCCCCA 780
GCCTTGCTGC CTCTCTACTT CCAGTGGTAT ATTTTCTATT TTGTTATCCA GCGAAAGAAG 840
TGGGTGGACT TGGCCTGGAT CAGCAAACAG GAATACGATG AAGCCGGGCT TCCATTGTCC 900
ACCGCAAATG CTTCTAAA 918

(2) INFORMATION FOR SEQ ID NO: 25:



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1686 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid (Edited Contig 2511785)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

GCCACTTAAA GGGTGCCTCT GCCAACTGGT GGAATCATCG CCACTTCCAG CACCACGCCA 60
AGCCTAACAT CTTCCACAAG GATCCCGATG TGAACATGCT GCACGTGTTT GTTCTGGGCG 120
AATGGCAGCC CATCGAGTAC GGCAAGAAGA AGCTGAAATA CCTGCCCTAC AATCACCAGC 180
ACGAATACTT CTTCCTGATT GGGCCGCCGC TGCTCATCCC CATGTATTTC CAGTACCAGA 240
TCATCATGAC CATGATCGTC CATAAGAACT GGGTGGACCT GGCCTGGGCC GTCAGCTACT 300
ACATCCGGTT CTTCATCACC TACATCCCTT TCTACGGCAT CCTGGGAGCC CTCCTTTTCC 360
TCAACTTCAT CAGGTTCCTG GAGAGCCACT GGTTTGTGTG GGTCACACAG ATGAATCACA 420
TCGTCATGGA GATTGACCAG GAGGCCTACC GTGACTGGTT CAGTAGCCAG CTGACAGCCA 480
CCTGCAACGT GGAGCAGTCC TTCTTCAACG ACTGGTTCAG TGGACACCTT AACTTCCAGA 540
TTGAGCACCA CCTCTTCCCC ACCATGCCCC GGCACAACTT ACACAAGATC GCCCCGCTGG 600
TGAAGTCTCT ATGTGCCAAG CATGGCATTG AATACCAGGA GAAGCCGCTA CTGAGGGCCC 660
TGCTGGACAT CATCAGGTCC CTGAAGAAGT CTGGGAAGCT GTGGCTGGAC GCCTACCTTC 720
ACAAATGAAG CCACAGCCCC CGGGACACCG TGGGGAAGGG GTGCAGGTGG GGTGATGGCC 780
AGAGGAATGA TGGGCTTTTG TTCTGAGGGG TGTCCGAGAG GCTGGTGTAT GCACTGCTCA 840
CGGACCCCAT GTTGGATCTT TCTCCCTTTC TCCTCTCCTT TTTCTCTTCA CATCTCCCCC 900
ATAGCACCCT GCCCTCATGG GACCTGCCCT CCCTCAGCCG TCAGCCATCA GCCATGGCCC 960
TCCCAGTGCC TCCTAGCCCC TTCTTCCAAG GAGCAGAGAG GTGGCCACCG GGGGTGGCTC 1020
TGTCCTACCT CCACTCTCTG CCCCTAAAGA TGGGAGGAGA CCAGCGGTCC ATGGGTCTGG 1080
CCTGTGAGTC TCCCCTTGCA GCCTGGTCAC TAGGCATCAC CCCCGCTTTG GTTCTTCAGA 1140
TGCTCTTGGG GTTCATAGGG GCAGGTCCTA GTCGGGCAGG GCCCCTGACC CTCCCGGCCT 1200
GGCTTCACTC TCCCTGACGG CTGCCATTGG TCCACCCTTT CATAGAGAGG CCTGCTTTGT 1260
TACAAAGCTC GGGTCTCCCT CCTGCAGCTC GGTTAAGTAC CCGAGGCCTC TCTTAAGATG 1320
TCCAGGGCCC CAGGCCCGCG GGCACAGCCA GCCCAAACCT TGGGCCCTGG AAGAGTCCTC 1380
CACCCCATCA CTAGAGTGCT CTGACCCTGG GCTTTCACGG GCCCCATTCC ACCGCCTCCC 1440
CAACTTGAGC CTGTGACCTT GGGACCAAAG GGGGAGTCCC TCGTCTCTTG TGACTCAGCA 1500
GAGGCAGTGG CCACGTTCAG GGAGGGGCCG GCTGGCCTGG AGGCTCAGCC CACCCTCCAG 1560
CTTTTCCTCA GGGTGTCCTG AGGTCCAAGA TTCTGGAGCA ATCTGACCCT TCTCCAAAGG 1620
CTCTGTTATC AGCTGGGCAG TGCCAGCCAA TCCCTGGCCA TTTGGCCCCA GGGGACGTGG 1680
GCCCTG 1686



(2) INFORMATION FOR SEQ ID NO: 26: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1843 base pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid (Contig 2535)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

GTCTTTTACT TTGGCAATGG CTGGATTCCT ACCCTCATCA CGGCCTTTGT CCTTGCTACC 60
TCTCAGGCCC AAGCTGGATG GCTGCAACAT GATTATGGCC ACCTGTCTGT CTACAGAAAA 120
CCCAAGTGGA ACCACCTTGT CCACAAATTC GTCATTGGCC ACTTAAAGGG TGCCTCTGCC 180
AACTGGTGGA ATCATCGCCA CTTCCAGCAC CACGCCAAGC CTAACATCTT CCACAAGGAT 240
CCCGATGTGA ACATGCTGCA CGTGTTTGTT CTGGGCGAAT GGCAGCCCAT CGAGTACGGC 300
AAGAAGAAGC TGAAATACCT GCCCTACAAT CACCAGCACG AATACTTCTT CCTGATTGGG 360
CCGCCGCTGC TCATCCCCAT GTATTTCCAG TACCAGATCA TCATGACCAT GATCGTCCAT 420
AAGAACTGGG TGGACCTGGC CTGGGCCGTC AGCTACTACA TCCGGTTCTT CATCACCTAC 480
ATCCCTTTCT ACGGCATCCT GGGAGCCCTC CTTTTCCTCA ACTTCATCAG GTTCCTGGAG 540
AGCCACTGGT TTGTGTGGGT CACACAGATG AATCACATCG TCATGGAGAT TGACCAGGAG 600
GCCTACCGTG ACTGGTTCAG TAGCCAGCTG ACAGCCACCT GCAACGTGGA GCAGTCCTTC 660
TTCAACGACT GGTTCAGTGG ACACCTTAAC TTCCAGATTG AGCACCACCT CTTCCCCACC 720
ATGCCCCGGC ACAACTTACA CAAGATCGCC CCGCTGGTGA AGTCTCTATG TGCCAAGCAT 780
GGCATTGAAT ACCAGGAGAA GCCGCTACTG AGGGCCCTGC TGGACATCAT CAGGTCCCTG 840
AAGAAGTCTG GGAAGCTGTG GCTGGACGCC TACCTTCACA AATGAAGCCA CAGCCCCCGG 900
GACACCGTGG GGAAGGGGTG CAGGTGGGGT GATGGCCAGA GGAATGATGG GCTTTTGTTC 960
TGAGGGGTGT CCGAGAGGCT GGTGTATGCA CTGCTCACGG ACCCCATGTT GGATCTTTCT 1020
CCCTTTCTCC TCTCCTTTTT CTCTTCACAT CTCCCCCATA GCACCCTGCC CTCATGGGAC 1080
CTGCCCTCCC TCAGCCGTCA GCCATCAGCC ATGGCCCTCC CAGTGCCTCC TAGCCCCTTC 1140
TTCCAAGGAG CAGAGAGGTG GCCACCGGGG GTGGCTCTGT CCTACCTCCA CTCTCTGCCC 1200
CTAAAGATGG GAGGAGACCA GCGGTCCATG GGTCTGGCCT GTGAGTCTCC CCTTGCAGCC 1260
TGGTCACTAG GCATCACCCC CGCTTTGGTT CTTCAGATGC TCTTGGGGTT CATAGGGGCA 1320
GGTCCTAGTC GGGCAGGGCC CCTGACCCTC CCGGCCTGGC TTCACTCTCC CTGACGGCTG 1380
CCATTGGTCC ACCCTTTCAT AGAGAGGCCT GCTTTGTTAC AAAGCTCGGG TCTCCCTCCT 1440
GCAGCTCGGT TAAGTACCCG AGGCCTCTCT TAAGATGTCC AGGGCCCCAG GCCCGCGGGC 1500
ACAGCCAGCC CAAACCTTGG GCCCTGGAAG AGTCCTCCAC CCCATCACTA GAGTGCTCTG 1560
ACCCTGGGCT TTCACGGGCC CCATTCCACC GCCTCCCCAA CTTGAGCCTG TGACCTTGGG 1620
ACCAAAGGGG GAGTCCCTCG TCTCTTGTGA CTCAGCAGAG GCAGTGGCCA CGTTCAGGGA 1680
GGGGCCGGCT GGCCTGGAGG CTCAGCCCAC CCTCCAGCTT TTCCTCAGGG TGTCCTGAGG 1740
TCCAAGATTC TGGAGCAATC TGACCCTTCT CCAAAGGCTC TGTTATCAGC TGGGCAGTGC 1800
CAGCCAATCC CTGGCCATTT GGCCCCAGGG GACGTGGGCC CTG 1843

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2257 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid (Edited Contig 253538a)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

CAGGGACCTA CCCCGCGCTA CTTCACCTGG GACGAGGTGG CCCAGCGCTC AGGGTGCGAG 60
GAGCGGTGGC TAGTGATCGA CCGTAAGGTG TACAACATCA GCGAGTTCAC CCGCCGGCAT 120
CCAGGGGGCT CCCGGGTCAT CAGCCACTAC GCCGGGCAGG ATGCCACGGA TCCCTTTGTG 180
GCCTTCCACA TCAACAAGGG CCTTGTGAAG AAGTATATGA ACTCTCTCCT GATTGGAGAA 240
CTGTCTCCAG AGCAGCCCAG CTTTGAGCCC ACCAAGAATA AAGAGCTGAC AGATGAGTTC 300
CGGGAGCTGC GGGCCACAGT GGAGCGGATG GGGCTCATGA AGGCCAACCA TGTCTTCTTC 360
CTGCTGTACC TGCTGCACAT CTTGCTGCTG GATGGTGCAG CCTGGCTCAC CCTTTGGGTC 420
TTTGGGACGT CCTTTTTGCC CTTCCTCCTC TGTGCGGTGC TGCTCAGTGC AGTTCAGCAG 480
GCCCAAGCTG GATGGCTGCA ACATGATTAT GGCCACCTGT CTGTCTACAG AAAACCCAAG 540
TGGAACCACC TTGTCCACAA ATTCGTCATT GGCCACTTAA AGGGTGCCTC TGCCAACTGG 600
TGGAATCATC GCCACTTCCA GCACCACGCC AAGCCTAACA TCTTCCACAA GGATCCCGAT 660
GTGAACATGC TGCACGTGTT TGTTCTGGGC GAATGGCAGC CCATCGAGTA CGGCAAGAAG 720
AAGCTGAAAT ACCTGCCCTA CAATCACCAG CACGAATACT TCTTCCTGAT TGGGCCGCCG 780
CTGCTCATCC CCATGTATTT CCAGTACCAG ATCATCATGA CCATGATCGT CCATAAGAAC 840
TGGGTGGACC TGGCCTGGGC CGTCAGCTAC TACATCCGGT TCTTCATCAC CTACATCCCT 900
TTCTACGGCA TCCTGGGAGC CCTCCTTTTC CTCAACTTCA TCAGGTTCCT GGAGAGCCAC 960
TGGTTTGTGT GGGTCACACA GATGAATCAC ATCGTCATGG AGATTGACCA GGAGGCCTAC 1020
CGTGACTGGT TCAGTAGCCA GCTGACAGCC ACCTGCAACG TGGAGCAGTC CTTCTTCAAC 1080
GACTGGTTCA GTGGACACCT TAACTTCCAG ATTGAGCACC ACCTCTTCCC CACCATGCCC 1140
CGGCACAACT TACACAAGAT CGCCCCGCTG GTGAAGTCTC TATGTGCCAA GCATGGCATT 1200
GAATACCAGG AGAAGCCGCT ACTGAGGGCC CTGCTGGACA TCATCAGGTC CCTGAAGAAG 1260
TCTGGGAAGC TGTGGCTGGA CGCCTACCTT CACAAATGAA GCCACAGCCC CCGGGACACC 1320
GTGGGGAAGG GGTGCAGGTG GGGTGATGGC CAGAGGAATG ATGGGCTTTT GTTCTGAGGG 1380
GTGTCCGAGA GGCTGGTGTA TGCACTGCTC ACGGACCCCA TGTTGGATCT TTCTCCCTTT 1440
CTCCTCTCCT TTTTCTCTTC ACATCTCCCC CATAGCACCC TGCCCTCATG GGACCTGCCC 1500
TCCCTCAGCC GTCAGCCATC AGCCATGGCC CTCCCAGTGC CTCCTAGCCC CTTCTTCCAA 1560
GGAGCAGAGA GGTGGCCACC GGGGGTGGCT CTGTCCTACC TCCACTCTCT GCCCCTAAAG 1620
ATGGGAGGAG ACCAGCGGTC CATGGGTCTG GCCTGTGAGT CTCCCCTTGC AGCCTGGTCA 1680
CTAGGCATCA CCCCCGCTTT GGTTCTTCAG ATGCTCTTGG GGTTCATAGG GGCAGGTCCT 1740
AGTCGGGCAG GGCCCCTGAC CCTCCCGGCC TGGCTTCACT CTCCCTGACG GCTGCCATTG 1800
GTCCACCCTT TCATAGAGAG GCCTGCTTTG TTACAAAGCT CGGGTCTCCC TCCTGCAGCT 1860
CGGTTAAGTA CCCGAGGCCT CTCTTAAGAT GTCCAGGGCC CCAGGCCCGC GGGCACAGCC 1920
AGCCCAAACC TTGGGCCCTG GAAGAGTCCT CCACCCCATC ACTAGAGTGC TCTGACCCTG 1980
GGCTTTCACG GGCCCCATTC CACCGCCTCC CCAACTTGAG CCTGTGACCT TGGGACCAAA 2040
GGGGGAGTCC CTCGTCTCTT GTGACTCAGC AGAGGCAGTG GCCACGTTCA GGGAGGGGCC 2100
GGCTGGCCTG GAGGCTCAGC CCACCCTCCA GCTTTTCCTC AGGGTGTCCT GAGGTCCAAG 2160
ATTCTGGAGC AATCTGACCC TTCTCCAAAG GCTCTGTTAT CAGCTGGGCA GTGCCAGCCA 2220
ATCCCTGGCC ATTTGGCCCC AGGGGACGTG GGCCCTG 2257

(2) INFORMATION FOR SEQ ID NO: 28: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 411 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: amino acid (Translation of Contig 2692004)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

His Ala Asp Arg Arg Arg Glu Ile Leu Ala Lys Tyr Pro Glu Ile
1 5 10 15
Lys Ser Leu Met Lys Pro Asp Pro Asn Leu Ile Trp Ile Ile Ile
20 25 30
Met Met Val Leu Thr Gln Leu Gly Ala Phe Tyr Ile Val Lys Asp
35 40 45
Leu Asp Trp Lys Trp Val Ile Phe Gly Ala Tyr Ala Phe Gly Ser
50 55 60
Cys Ile Asn His Ser Met Thr Leu Ala Ile His Glu Ile Ala His
65 70 75
Asn Ala Ala Phe Gly Asn Cys Lys Ala Met Trp Asn Arg Trp Phe
80 85 90
Gly Met Phe Ala Asn Leu Pro Ile Gly Ile Pro Tyr Ser Ile Ser
95 100 105
Phe Lys Arg Tyr His Met Asp His His Arg Tyr Leu Gly Ala Asp
110 115 120
Gly Val Asp Val Asp Ile Pro Thr Asp Phe Glu Gly Trp Phe Phe
125 130 135
Cys Thr Ala Phe Arg Lys Phe Ile Trp Val Ile Leu Gln Pro Leu
140 145 150
Phe Tyr Ala Phe Arg Pro Leu Phe Ile Asn Pro Lys Pro Ile Thr
155 160 165
Tyr Leu Glu Val Ile Asn Thr Val Ala Gln Val Thr Phe Asp Ile
170 175 180
Leu Ile Tyr Tyr Phe Leu Gly Ile Lys Ser Leu Val Tyr Met Leu
185 190 195
Ala Ala Ser Leu Leu Gly Leu Gly Leu His Pro Ile Ser Gly His
200 205 210
Phe Ile Ala Glu His Tyr Met Phe Leu Lys Gly His Glu Thr Tyr
215 220 225
Ser Tyr Tyr Gly Pro Leu Asn Leu Leu Thr Phe Asn Val Gly Tyr
230 235 240
His Asn Glu His His Asp Phe Pro Asn Ile Pro Gly Lys Ser Leu
245 250 255
Pro Leu Val Arg Lys Ile Ala Ala Glu Tyr Tyr Asp Asn Leu Pro
260 265 270
His Tyr Asn Ser Trp Ile Lys Val Leu Tyr Asp Phe Val Met Asp
275 280 285
Asp Thr Ile Ser Pro Tyr Ser Arg Met Lys Arg His Gln Lys Gly
290 295 300
Glu Met Val Leu Glu *** Ile Ser Leu Val Pro Lys Gly Phe Phe
305 310 315
Ser Lys Thr Leu Asp Asp Lys Met Glu Phe Leu His Tyr *** Thr
320 325 330
*** Asp Gln *** Cys Ser Glu Ala Pro Leu Ala Gln Phe Gln Ser
335 340 345
Lys Ser Ser Val Ile Pro Arg Ser Glu Ser Gly Phe *** Thr Val
350 355 360
Ser Leu Thr Leu Tyr Cys Ser Val Ser Leu Thr Gly Asn Leu ***
365 370 375
Leu Val Tyr Tyr Arg His *** Gly Cys Phe Thr His Val Cys His
380 385 390
Phe Ile Ser Ile Ser Phe Lys Lys Leu Leu Lys Ser Tyr Phe Ala
400 405 410
Arg

(2) INFORMATION FOR SEQ ID NO: 29: 

60 (i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 218 amino acids 

(B) TYPE: amino acid 



(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: amino acid (Translation of Contig 2153526)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

Tyr Leu Leu Arg Pro Leu Leu Pro His Leu Cys Ala Thr Ile Gly
1 5 10 15
Ala Glu Ser Phe Leu Gly Leu Phe Phe Ile Val Arg Phe Leu Glu
20 25 30
Ser Asn Trp Phe Val Trp Val Thr Gln Met Asn His Ile Pro Met
35 40 45
His Ile Asp His Asp Arg Asn Met Asp Trp Val Ser Thr Gln Leu
50 55 60
Gln Ala Thr Cys Asn Val His Lys Ser Ala Phe Asn Asp Trp Phe
65 70 75
Ser Gly His Leu Asn Phe Gln Ile Glu His His Leu Phe Pro Thr
80 85 90
Met Pro Arg His Asn Tyr His Lys Val Ala Pro Leu Val Gln Ser
95 100 105
Leu Cys Ala Lys His Gly Ile Glu Tyr Gln Ser Lys Pro Leu Leu
110 115 120
Ser Ala Phe Ala Asp Ile Ile His Ser Leu Lys Glu Ser Gly Gln
125 130 135
Leu Trp Leu Asp Ala Tyr Leu His Gln *** Gln Gln Pro Pro Cys
140 145 150
Pro Val Trp Lys Lys Arg Arg Lys Thr Leu Glu Pro Arg Gln Arg
155 160 165
Gly Ala *** Gly Thr Met Pro Leu *** Phe Asn Thr Gln Arg Gly
170 175 180
Leu Gly Leu Gly Thr *** Ser Leu *** Leu Lys Leu Leu Pro Phe
185 190 195
Ile Phe *** Pro Gln Phe *** Asp Pro Lys Trp Gly Val Asp Thr
200 205 210
Glu Val Pro Arg Arg Glu Gly Ala
215

(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 71 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: amino acid (Translation of Contig 3506132)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

Val Phe Tyr Phe Gly Asn Gly Trp Ile Pro Thr Leu Ile Thr Ala
1 5 10 15
Phe Val Leu Ala Thr Ser Gln Ala Gln Ala Gly Trp Leu Gln His
20 25 30
Asp Tyr Gly His Leu Ser Val Tyr Arg Lys Pro Lys Trp Asn His
35 40 45
Leu Val His Lys Phe Val Ile Gly His Leu Lys Gly Ala Ser Ala
50 55 60
Asn Trp Trp Asn His Arg His Phe Gln His His Ala Lys Pro Asn
65 70 75
Leu Gly Glu Trp Gln Pro Ile Glu Tyr Gly Lys Xxx
80 85



(2) INFORMATION FOR SEQ ID NO: 31: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 306 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: amino acid (Translation of Contig 3854933)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

Gln Gly Pro Thr Pro Arg Tyr Phe Thr Trp Asp Glu Val Ala Gln
1 5 10 15
Arg Ser Gly Cys Glu Glu Arg Trp Leu Val Ile Asp Arg Lys Val
20 25 30
Tyr Asn Ile Ser Glu Phe Thr Arg Arg His Pro Gly Gly Ser Arg
35 40 45
Val Ile Ser His Tyr Ala Gly Gln Asp Ala Thr Asp Pro Phe Val
50 55 60
Ala Phe His Ile Asn Lys Gly Leu Val Lys Lys Tyr Met Asn Ser
65 70 75
Leu Leu Ile Gly Glu Leu Ser Pro Glu Gln Pro Ser Phe Glu Pro
80 85 90
Thr Lys Asn Lys Glu Leu Thr Asp Glu Phe Arg Glu Leu Arg Ala
95 100 105
Thr Val Glu Arg Met Gly Leu Met Lys Ala Asn His Val Phe Phe
110 115 120
Leu Leu Tyr Leu Leu His Ile Leu Leu Leu Asp Gly Ala Ala Trp
125 130 135
Leu Thr Leu Trp Val Phe Gly Thr Ser Phe Leu Pro Phe Leu Leu
140 145 150
Cys Ala Val Leu Leu Ser Ala Val Gln Ala Gln Ala Gly Trp Leu
155 160 165
Gln His Asp Phe Gly His Leu Ser Val Phe Ser Thr Ser Lys Trp
170 175 180
Asn His Leu Leu His His Phe Val Ile Gly His Leu Lys Gly Ala
185 190 195
Pro Ala Ser Trp Trp Asn His Met His Phe Gln His His Ala Lys
200 205 210
Pro Asn Cys Phe Arg Lys Asp Pro Asp Ile Asn Met His Pro Phe
215 220 225
Phe Phe Ala Leu Gly Lys Ile Leu Ser Val Glu Leu Gly Lys Gln
230 235 240
Lys Lys Lys Tyr Met Pro Tyr Asn His Gln His Xxx Tyr Phe Phe
245 250 255
Leu Ile Gly Pro Pro Ala Leu Leu Pro Leu Tyr Phe Gln Trp Tyr
260 265 270
Ile Phe Tyr Phe Val Ile Gln Arg Lys Lys Trp Val Asp Leu Ala
275 280 285
Trp Ile Ser Lys Gln Glu Tyr Asp Glu Ala Gly Leu Pro Leu Ser
290 295 300
Thr Ala Asn Ala Ser Lys
305



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 566 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: amino acid (Translation of Contig 2511785)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

His Leu Lys Gly Ala Ser Ala Asn Trp Trp Asn His Arg His Phe
1 5 10 15
Gln His His Ala Lys Pro Asn Ile Phe His Lys Asp Pro Asp Val
20 25 30
Asn Met Leu His Val Phe Val Leu Gly Glu Trp Gln Pro Ile Glu
35 40 45
Tyr Gly Lys Lys Lys Leu Lys Tyr Leu Pro Tyr Asn His Gln His
50 55 60
Glu Tyr Phe Phe Leu Ile Gly Pro Pro Leu Leu Ile Pro Met Tyr
65 70 75
Phe Gln Tyr Gln Ile Ile Met Thr Met Ile Val His Lys Asn Trp
80 85 90
Val Asp Leu Ala Trp Ala Val Ser Tyr Tyr Ile Arg Phe Phe Ile
95 100 105
Thr Tyr Ile Pro Phe Tyr Gly Ile Leu Gly Ala Leu Leu Phe Leu
110 115 120
Asn Phe Ile Arg Phe Leu Glu Ser His Trp Phe Val Trp Val Thr
125 130 135
Gln Met Asn His Ile Val Met Glu Ile Asp Gln Glu Ala Tyr Arg
140 145 150
Asp Trp Phe Ser Ser Gln Leu Thr Ala Thr Cys Asn Val Glu Gln
155 160 165
Ser Phe Phe Asn Asp Trp Phe Ser Gly His Leu Asn Phe Gln Ile
170 175 180
Glu His His Leu Phe Pro Thr Met Pro Arg His Asn Leu His Lys
185 190 195
Ile Ala Pro Leu Val Lys Ser Leu Cys Ala Lys His Gly Ile Glu
200 205 210
Tyr Gln Glu Lys Pro Leu Leu Arg Ala Leu Leu Asp Ile Ile Arg
215 220 225
Ser Leu Lys Lys Ser Gly Lys Leu Trp Leu Asp Ala Tyr Leu His
230 235 240
Lys *** Ser His Ser Pro Arg Asp Thr Val Gly Lys Gly Cys Arg
245 250 255
Trp Gly Asp Gly Gln Arg Asn Asp Gly Leu Leu Phe *** Gly Val
260 265 270
Ser Glu Arg Leu Val Tyr Ala Leu Leu Thr Asp Pro Met Leu Asp
275 280 285
Leu Ser Pro Phe Leu Leu Ser Phe Phe Ser Ser His Leu Pro His
290 295 300
Ser Thr Leu Pro Ser Trp Asp Leu Pro Ser Leu Ser Arg Gln Pro
305 310 315
Ser Ala Met Ala Leu Pro Val Pro Pro Ser Pro Phe Phe Gln Gly
320 325 330
Ala Glu Arg Trp Pro Pro Gly Val Ala Leu Ser Tyr Leu His Ser
335 340 345
Leu Pro Leu Lys Met Gly Gly Asp Gln Arg Ser Met Gly Leu Ala
350 355 360
Cys Glu Ser Pro Leu Ala Ala Trp Ser Leu Gly Ile Thr Pro Ala
365 370 375
Leu Val Leu Gln Met Leu Leu Gly Phe Ile Gly Ala Gly Pro Ser
380 385 390
Arg Ala Gly Pro Leu Thr Leu Pro Ala Trp Leu His Ser Pro ***
400 405 410
Arg Leu Pro Leu Val His Pro Phe Ile Glu Arg Pro Ala Leu Leu
415 420 425
Gln Ser Ser Gly Leu Pro Pro Ala Ala Arg Leu Ser Thr Arg Gly
430 435 440
Leu Ser *** Asp Val Gln Gly Pro Arg Pro Ala Gly Thr Ala Ser
445 450 455
Pro Asn Leu Gly Pro Trp Lys Ser Pro Pro Pro His His *** Ser
460 465 470
Ala Leu Thr Leu Gly Phe His Gly Pro His Ser Thr Ala Ser Pro
475 480 485
Thr *** Ala Cys Asp Leu Gly Thr Lys Gly Gly Val Pro Arg Leu
490 495 500
Leu *** Leu Ser Arg Gly Ser Gly His Val Gln Gly Gly Ala Gly
505 510 515
Trp Pro Gly Gly Ser Ala His Pro Pro Ala Phe Pro Gln Gly Val
520 525 530
Leu Arg Ser Lys Ile Leu Glu Gln Ser Asp Pro Ser Pro Lys Ala
535 540 545
Leu Leu Ser Ala Gly Gln Cys Gln Pro Ile Pro Gly His Leu Ala
550 555 560
Pro Gly Asp Val Gly Pro Xxx
565



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 619 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: amino acid (Translation of Contig 2535) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

Val Phe Tyr Phe Gly Asn Gly Trp Ile Pro Thr Leu Ile Thr Ala
1 5 10 15

Phe Val Leu Ala Thr Ser Gln Ala Gln Ala Gly Trp Leu Gln His
20 25 30

Asp Tyr Gly His Leu Ser Val Tyr Arg Lys Pro Lys Trp Asn His
35 40 45

Leu Val His Lys Phe Val Ile Gly His Leu Lys Gly Ala Ser Ala
50 55 60

Asn Trp Trp Asn His Arg His Phe Gln His His Ala Lys Pro Asn
65 70 75

Ile Phe His Lys Asp Pro Asp Val Asn Met Leu His Val Phe Val
80 85 90

Leu Gly Glu Trp Gln Pro Ile Glu Tyr Gly Lys Lys Lys Leu Lys
95 100 105

Tyr Leu Pro Tyr Asn His Gln His Glu Tyr Phe Phe Leu Ile Gly
110 115 120

Pro Pro Leu Leu Ile Pro Met Tyr Phe Gln Tyr Gln Ile Ile Met
125 130 135

Thr Met Ile Val His Lys Asn Trp Val Asp Leu Ala Trp Ala Val
140 145 150

Ser Tyr Tyr Ile Arg Phe Phe Ile Thr Tyr Ile Pro Phe Tyr Gly
155 160 165

Ile Leu Gly Ala Leu Leu Phe Leu Asn Phe Ile Arg Phe Leu Glu
170 175 180

Ser His Trp Phe Val Trp Val Thr Gln Met Asn His Ile Val Met
185 190 195

Glu Ile Asp Gln Glu Ala Tyr Arg Asp Trp Phe Ser Ser Gln Leu
200 205 210

Thr Ala Thr Cys Asn Val Glu Gln Ser Phe Phe Asn Asp Trp Phe
215 220 225

Ser Gly His Leu Asn Phe Gln Ile Glu His His Leu Phe Pro Thr
230 235 240

Met Pro Arg His Asn Leu His Lys Ile Ala Pro Leu Val Lys Ser
245 250 255

Leu Cys Ala Lys His Gly Ile Glu Tyr Gln Glu Lys Pro Leu Leu
260 265 270

Arg Ala Leu Leu Asp Ile Ile Arg Ser Leu Lys Lys Ser Gly Lys
275 280 285

Leu Trp Leu Asp Ala Tyr Leu His Lys *** Ser His Ser Pro Arg
290 295 300

Asp Thr Val Gly Lys Gly Cys Arg Trp Gly Asp Gly Gln Arg Asn
305 310 315

Asp Gly Leu Leu Phe *** Gly Val Ser Glu Arg Leu Val Tyr Ala
320 325 330

Leu Leu Thr Asp Pro Met Leu Asp Leu Ser Pro Phe Leu Leu Ser
335 340 345

Phe Phe Ser Ser His Leu Pro His Ser Thr Leu Pro Ser Trp Asp
350 355 360

Leu Pro Ser Leu Ser Arg Gln Pro Ser Ala Met Ala Leu Pro Val
365 370 375



Pro Pro Ser Pro Phe Phe Gln Gly Ala Glu Arg Trp Pro Pro Gly
380 385 390

Val Ala Leu Ser Tyr Leu His Ser Leu Pro Leu Lys Met Gly Gly
400 405 410

Asp Gln Arg Ser Met Gly Leu Ala Cys Glu Ser Pro Leu Ala Ala
415 420 425

Trp Ser Leu Gly Ile Thr Pro Ala Leu Val Leu Gln Met Leu Leu
430 435 440

Gly Phe Ile Gly Ala Gly Pro Ser Arg Ala Gly Pro Leu Thr Leu
445 450 455

Pro Ala Trp Leu His Ser Pro *** Arg Leu Pro Leu Val His Pro
460 465 470

Phe Ile Glu Arg Pro Ala Leu Leu Gln Ser Ser Gly Leu Pro Pro
475 480 485

Ala Ala Arg Leu Ser Thr Arg Gly Leu Ser *** Asp Val Gln Gly
490 495 500

Pro Arg Pro Ala Gly Thr Ala Ser Pro Asn Leu Gly Pro Trp Lys
505 510 515

Ser Pro Pro Pro His His *** Ser Ala Leu Thr Leu Gly Phe His
520 525 530

Gly Pro His Ser Thr Ala Ser Pro Thr *** Ala Cys Asp Leu Gly
535 540 545

Thr Lys Gly Gly Val Pro Arg Leu Leu *** Leu Ser Arg Gly Ser
550 555 560

Gly His Val Gln Gly Gly Ala Gly Trp Pro Gly Gly Ser Ala His
565 570 575

Pro Pro Ala Phe Pro Gln Gly Val Leu Arg Ser Lys Ile Leu Glu
580 585 590

Gln Ser Asp Pro Ser Pro Lys Ala Leu Leu Ser Ala Gly Gln Cys
595 600 605

Gln Pro Ile Pro Gly His Leu Ala Pro Gly Asp Val Gly Pro Xxx
610 615 620



(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 757 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: amino acid (Translation of Contig 253538a)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34: 

Gln Gly Pro Thr Pro Arg Tyr Phe Thr Trp Asp Glu Val Ala Gln
1 5 10 15

Arg Ser Gly Cys Glu Glu Arg Trp Leu Val Ile Asp Arg Lys Val
20 25 30

Tyr Asn Ile Ser Glu Phe Thr Arg Arg His Pro Gly Gly Ser Arg
35 40 45

Val Ile Ser His Tyr Ala Gly Gln Asp Ala Thr Asp Pro Phe Val
50 55 60

Ala Phe His Ile Asn Lys Gly Leu Val Lys Lys Tyr Met Asn Ser
65 70 75

Leu Leu Ile Gly Glu Leu Ser Pro Glu Gln Pro Ser Phe Glu Pro
80 85 90

Thr Lys Asn Lys Glu Leu Thr Asp Glu Phe Arg Glu Leu Arg Ala
95 100 105

Thr Val Glu Arg Met Gly Leu Met Lys Ala Asn His Val Phe Phe
110 115 120

Leu Leu Tyr Leu Leu His Ile Leu Leu Leu Asp Gly Ala Ala Trp
125 130 135

Leu Thr Leu Trp Val Phe Gly Thr Ser Phe Leu Pro Phe Leu Leu
140 145 150

Cys Ala Val Leu Leu Ser Ala Val Gln Gln Ala Gln Ala Gly Trp
155 160 165

Leu Gln His Asp Tyr Gly His Leu Ser Val Tyr Arg Lys Pro Lys
170 175 180

Trp Asn His Leu Val His Lys Phe Val Ile Gly His Leu Lys Gly
185 190 195

Ala Ser Ala Asn Trp Trp Asn His Arg His Phe Gln His His Ala
200 205 210

Lys Pro Asn Ile Phe His Lys Asp Pro Asp Val Asn Met Leu His
215 220 225

Val Phe Val Leu Gly Glu Trp Gln Pro Ile Glu Tyr Gly Lys Lys
230 235 240

Lys Leu Lys Tyr Leu Pro Tyr Asn His Gln His Glu Tyr Phe Phe
245 250 255

Leu Ile Gly Pro Pro Leu Leu Ile Pro Met Tyr Phe Gln Tyr Gln
260 265 270

Ile Ile Met Thr Met Ile Val His Lys Asn Trp Val Asp Leu Ala
275 280 285

Trp Ala Val Ser Tyr Tyr Ile Arg Phe Phe Ile Thr Tyr Ile Pro
290 295 300

Phe Tyr Gly Ile Leu Gly Ala Leu Leu Phe Leu Asn Phe Ile Arg
305 310 315

Phe Leu Glu Ser His Trp Phe Val Trp Val Thr Gln Met Asn His
320 325 330

Ile Val Met Glu Ile Asp Gln Glu Ala Tyr Arg Asp Trp Phe Ser
335 340 345

Ser Gln Leu Thr Ala Thr Cys Asn Val Glu Gln Ser Phe Phe Asn
350 355 360

Asp Trp Phe Ser Gly His Leu Asn Phe Gln Ile Glu His His Leu
365 370 375

Phe Pro Thr Met Pro Arg His Asn Leu His Lys Ile Ala Pro Leu
380 385 390

Val Lys Ser Leu Cys Ala Lys His Gly Ile Glu Tyr Gln Glu Lys
400 405 410

Pro Leu Leu Arg Ala Leu Leu Asp Ile Ile Arg Ser Leu Lys Lys
415 420 425

Ser Gly Lys Leu Trp Leu Asp Ala Tyr Leu His Lys *** Ser His
430 435 440

Ser Pro Arg Asp Thr Val Gly Lys Gly Cys Arg Trp Gly Asp Gly
445 450 455

Gln Arg Asn Asp Gly Leu Leu Phe *** Gly Val Ser Glu Arg Leu
460 465 470

Val Tyr Ala Leu Leu Thr Asp Pro Met Leu Asp Leu Ser Pro Phe
475 480 485

Leu Leu Ser Phe Phe Ser Ser His Leu Pro His Ser Thr Leu Pro
490 495 500

Ser Trp Asp Leu Pro Ser Leu Ser Arg Gln Pro Ser Ala Met Ala
505 510 515

Leu Pro Val Pro Pro Ser Pro Phe Phe Gln Gly Ala Glu Arg Trp
520 525 530

Pro Pro Gly Val Ala Leu Ser Tyr Leu His Ser Leu Pro Leu Lys
535 540 545

Met Gly Gly Asp Gln Arg Ser Met Gly Leu Ala Cys Glu Ser Pro
550 555 560

Leu Ala Ala Trp Ser Leu Gly Ile Thr Pro Ala Leu Val Leu Gln
565 570 575

Met Leu Leu Gly Phe Ile Gly Ala Gly Pro Ser Arg Ala Gly Pro
580 585 590

Leu Thr Leu Pro Ala Trp Leu His Ser Pro *** Arg Leu Pro Leu
595 600 605

Val His Pro Phe Ile Glu Arg Pro Ala Leu Leu Gln Ser Ser Gly
610 615 620

Leu Pro Pro Ala Ala Arg Leu Ser Thr Arg Gly Leu Ser *** Asp
625 630 635

Val Gln Gly Pro Arg Pro Ala Gly Thr Ala Ser Pro Asn Leu Gly
640 645 650

Pro Trp Lys Ser Pro Pro Pro His His *** Ser Ala Leu Thr Leu
655 660 665

Gly Phe His Gly Pro His Ser Thr Ala Ser Pro Thr *** Ala Cys
670 675 680

Asp Leu Gly Thr Lys Gly Gly Val Pro Arg Leu Leu *** Leu Ser
685 690 695

Arg Gly Ser Gly His Val Gln Gly Gly Ala Gly Trp Pro Gly Gly
700 705 710

Ser Ala His Pro Pro Ala Phe Pro Gln Gly Val Leu Arg Ser Lys
715 720 725

Ile Leu Glu Gln Ser Asp Pro Ser Pro Lys Ala Leu Leu Ser Ala
730 735 740

Gly Gln Cys Gln Pro Ile Pro Gly His Leu Ala Pro Gly Asp Val
745 750 755

Gly Pro Xxx

What is claimed is:

1. An isolated nucleic acid comprising:

a nucleotide sequence depicted in a SEQ ID NO: 1.

2. A polypeptide encoded by said nucleic acid of claim 1.

3. A purified or isolated polypeptide comprising an amino acid
sequence depicted in SEQ ID NO: 2.

4. An isolated nucleic acid encoding the polypeptide of SEQ ID
NO: 2.

5. An isolated nucleic acid comprising:

a nucleotide sequence which encodes a polypeptide that desaturates a
fatty acid molecule at carbon 5 from the carboxyl end of said fatty acid
molecule.

6. The isolated nucleic acid according to Claim 5, wherein said
nucleotide sequence is derived from eukaryotic cell.

7. The isolated nucleic acid according to Claim 6, wherein said
eukaryotic cell is a fungal cell.

8. The isolated nucleic acid according to Claim 7, wherein said
fungal cell is of the genus Mortierella.

9. The isolated nucleic acid according to Claim 8, wherein said
Mortierella cell is of the species Mortierella alpina.

10. The isolated nucleic acid according to Claim 5, wherein said
nucleotide sequence anneals to a nucleotide sequence depicted in SEQ ID
NO: 1.

11. The nucleic acid of claim 10, wherein said nucleotide sequence
encodes an amino acid sequence depicted in SEQ ID NO: 2.

12. The nucleic acid of claim 11, wherein said amino acid sequence
depicted in SEQ ID NO: 2 is selected from the group consisting of amino acid
residues 30-38, 41-44, 171-175, 203-212, and 387-394.

13. An isolated or purified polypeptide which desaturates a fatty acid
molecule at carbon 5 from the carboxyl end of said fatty acid molecule.

14. An isolated nucleic acid comprising:

a nucleotide sequence which is substantially identical to a sequence of at
least 50 nucleotides in SEQ ID NO: 1.

15. An isolated nucleic acid sequence having at least about 50%
identity to SEQ ID NO: 1.

16. A nucleic acid construct comprising:

a nucleotide sequence depicted in a SEQ ID NO: 1 linked to a
heterologous nucleic acid.

17. A nucleic acid construct comprising:

a nucleotide sequence depicted in a SEQ ID NO: 1 operably linked to a
promoter.

18. The nucleic acid construct of claim 17, wherein said promoter is
functional in a microbial cell.

19. The nucleic acid construct of claim 18, wherein said microbial
cell is a yeast cell.

20. The nucleic acid construct of claim 17, wherein said nucleotide
sequence is derived from a fungus.

21. The nucleic acid according to Claim 19, wherein said fungus is
of the genus Mortierella.

22. The nucleic acid according to Claim 20, wherein said fungus is
of the species Mortierella alpina.

23. A nucleic acid construct comprising:

a nucleotide sequence which encodes a polypeptide comprising an amino acid
sequence which corresponds to or is complementary to an amino acid sequence
depicted in SEQ ID NO: 2, wherein said nucleotide sequence is operably linked
to a promoter which is functional in a host cell, wherein said nucleotide
sequence encodes a polypeptide which desaturates a fatty acid molecule at
carbon 5 from the carboxyl end of a fatty acid molecule.

24. A nucleic acid construct comprising:

a nucleotide sequence which encodes a functionally active Δ5-desaturase,
said desaturase having an amino acid sequence which corresponds
to or is complementary to all of or a portion of an amino acid sequence depicted
in a SEQ ID NO: 2, wherein said nucleotide sequence is operably linked to a
promoter functional in a host cell.

25. A recombinant yeast cell comprising:

a nucleic construct according to Claim 23 or Claim 24.

26. The recombinant yeast cell according to Claim 25, wherein said
yeast cell is a Saccharomyces cell.

27. A host cell comprising:

at least one copy of a nucleotide sequence which encodes a polypeptide
which converts dihomo-γ-linolenic acid to arachidonic acid, wherein said
microbial cell or an ancestor of said microbial cell was transformed with a
vector comprising said nucleotide sequence, and wherein said nucleotide
sequence is operably linked to a promoter functional in said host cell.

28. The microbial cell according to Claim 27, wherein said cell is a
host cell selected from the group consisting of a fungal cell and an algal cell.

29. The microbial cell according to Claim 28, wherein said fungal
cell is a yeast cell and said algae cell is marine algal cell.

30. The microbial cell according to Claim 27, wherein said cell is
enriched for 20:3 fatty acids as compared to a host cell which is devoid of said
nucleotide sequence.

31. The microbial cell according to Claim 27, wherein said cell is
enriched for 20:4 or ω-3 20:4 fatty acids as compared to a host cell which is
devoid of said DNA sequence.

32. The microbial cell according to Claim 27, wherein said cell is
enriched for 20:5 fatty acids as compared to a host cell which is devoid of said
DNA sequence.

33. The microbial cell according to Claim 27, wherein said cell has
an altered amount of 20:3 (8, 11, 14) fatty acid as compared to an
untransformed microbial cell.

34. A method for production of arachidonic acid in a microbial cell
culture, said method comprising:

growing a microbial cell culture having a plurality of microbial cells,
wherein said microbial cells or ancestors of said microbial cells were
transformed with a vector comprising one or more nucleic acids having a
nucleotide sequence which encodes a polypeptide which converts
dihomo-γ-linolenic acid to arachidonic acid, wherein said one or more nucleic acids are
operably linked to a promoter, under conditions wherein said one or more
nucleic acids are expressed and arachidonic acid is produced in said microbial
cell culture.

35. The method of Claim 34, wherein said polypeptide is an enzyme
which desaturates a fatty acid molecule at carbon 5 from the carboxyl end of
said fatty acid molecule.

36. The method of Claim 34, wherein said nucleotide sequence is
derived from a Mortierella species.

37. The method according to Claim 34, wherein said
dihomo-γ-linolenic acid is exogenously supplied.

38. The method according to Claim 34, wherein said microbial cells
are yeast cells.

39. The method according to Claim 38, wherein said yeast cells are
Saccharomyces species cells.

40. The method according to Claim 34, wherein said conditions are
inducible.

41. A recombinant yeast cell which converts greater than about 5%
of a 20:3 fatty acid to a 20:4 fatty acid.

42. A nucleic acid probe comprising:

a nucleotide sequence as represented by SEQ ID NO: 1.

43. A host cell comprising:

a nucleic acid construct according to Claim 23 or Claim 24.

44. A host cell comprising:

a vector which includes a nucleic acid which encodes a fatty acid
desaturase derived from Mortierella alpina, wherein said fatty acid desaturase
comprises an amino acid sequence represented by SEQ ID NO:2, wherein said
nucleic acid is operably linked to a promoter.

45. The host cell according to Claim 44, wherein said host cell is a
eukaryotic cell.

46. The host cell according to Claim 45, wherein said eukaryotic cell
is selected from the group consisting of a mammalian cell, a plant cell, a fungal
cell, an avian cell and an algal cell.

47. The host cell according to Claim 45, wherein said host cell
contains dihomo-gamma-linolenic acid.

48. The host cell according to Claim 45, wherein said host cell
contains EPA.

49. The host cell according to Claim 44, wherein said promoter is
exogenously supplied.

50. A method for desaturating a dihomo-γ-linolenic acid, said
method comprising:

culturing a recombinant microbial cell according to Claim 37, under
conditions suitable for expression of polypeptide encoded by said nucleic acid,
wherein said host cell further comprises a fatty acid substrate of said
polypeptide.

51. A fatty acid desaturated by the method according to Claim 50.

52. An oil comprising a fatty acid according to Claim 51.

53. A method for obtaining altered long chain polyunsaturated fatty
acid biosynthesis comprising the steps of:

growing a microbe having cells which contain a transgene which
encodes a transgene expression product which desaturates a fatty acid molecule
at carbon 5 from the carboxyl end of said fatty acid molecule, wherein said
transgene is operably associated with an expression control sequence, under
conditions whereby said transgene is expressed, whereby long chain
polyunsaturated fatty acid biosynthesis in said cells is altered.

54. A method for obtaining altered long chain polyunsaturated fatty
acid biosynthesis comprising the steps of:

growing a microbe having cells which contain a transgene, derived from
a fungus or algae, which encodes a transgene expression product which
desaturates a fatty acid molecule at carbon 5 from the carboxyl end of said fatty
acid molecule, wherein said transgene is operably associated with an expression
control sequence, under conditions whereby said one or more transgenes is
expressed, whereby long chain polyunsaturated fatty acid biosynthesis in said
cells is altered.

55. The method according to claims 53 or 54, wherein said long
chain polyunsaturated fatty acid is selected from the group consisting of ARA,
DGLA and EPA.

56. A microbial oil or fraction thereof produced according to the
method of claims 53 or 54.

57. A method of treating or preventing malnutrition comprising
administering said microbial oil of claim 56 to a patient in need of said treatment
or prevention in an amount sufficient to effect said treatment or prevention.

58. A pharmaceutical composition comprising said microbial oil or
fraction of claim 56 and a pharmaceutically acceptable carrier.

59. The pharmaceutical composition of claim 58, wherein said
pharmaceutical composition is in the form of a solid or a liquid.

60. The pharmaceutical composition of claim 59, wherein said
pharmaceutical composition is in a capsule or tablet form.

61. The pharmaceutical composition of claim 58 further comprising
at least one nutrient selected from the group consisting of a vitamin, a mineral, a
carbohydrate, a sugar, an amino acid, a free fatty acid, a phospholipid, an
antioxidant, and a phenolic compound.

62. A nutritional formula comprising said microbial oil or fraction
thereof of claim 56.

63. The nutritional formula of claim 62, wherein said nutritional
formula is selected from the group consisting of an infant formula, a dietary
supplement, and a dietary substitute.

64. The nutritional formula of claim 63, wherein said infant formula,
dietary supplement or dietary supplement is in the form of a liquid or a solid.

65. An infant formula comprising said microbial oil or fraction
thereof of claim 56.

66. The infant formula of claim 65 further comprising at least one
macronutrient selected from the group consisting of coconut oil, soy oil, canola
oil, mono- and diglycerides, glucose, edible lactose, electrodialysed whey,
electrodialysed skim milk, milk whey, soy protein, and other protein
hydrolysates.

67. The infant formula of claim 66 further comprising at least one
vitamin selected from the group consisting of Vitamins A, C, D, E, and B
complex; and at least one mineral selected from the group consisting of
calcium, magnesium, zinc, manganese, sodium, potassium, phosphorus, copper,
chloride, iodine, selenium, and iron.

68. A dietary supplement comprising said microbial oil or fraction
thereof of claim 56.






WO 98/46765



PCT/US98/07422 



69. The dietary supplement of claim 68 further comprising at least
one macronutrient selected from the group consisting of coconut oil, soy oil,
canola oil, mono- and diglycerides, glucose, edible lactose, electrodialysed
whey, electrodialysed skim milk, milk whey, soy protein, and other protein
hydrolysates.

70. The dietary supplement of claim 69 further comprising at least
one vitamin selected from the group consisting of Vitamins A, C, D, E, and B
complex; and at least one mineral selected from the group consisting of
calcium, magnesium, zinc, manganese, sodium, potassium, phosphorus, copper,
chloride, iodine, selenium, and iron.

71. The dietary supplement of claim 68 or claim 70, wherein said
dietary supplement is administered to a human or an animal.

72. A dietary substitute comprising said microbial oil or fraction
thereof of claim 56.

73. The dietary substitute of claim 72 further comprising at least one
macronutrient selected from the group consisting of coconut oil, soy oil, canola
oil, mono- and diglycerides, glucose, edible lactose, electrodialysed whey,
electrodialysed skim milk, milk whey, soy protein, and other protein
hydrolysates.

74. The dietary substitute of claim 73 further comprising at least one
vitamin selected from the group consisting of Vitamins A, C, D, E, and B
complex; and at least one mineral selected from the group consisting of
calcium, magnesium, zinc, manganese, sodium, potassium, phosphorus, copper,
chloride, iodine, selenium, and iron.

75. The dietary substitute of claim 72 or claim 74, wherein said
dietary substitute is administered to a human or animal.

76. A method of treating a patient having a condition caused by
insufficient intake or production of polyunsaturated fatty acids comprising
administering to said patient said dietary substitute of claim 72 or said dietary
supplement of claim 68 in an amount sufficient to effect said treatment.

77. The method of claim 72, wherein said dietary substitute or said
dietary supplement is administered enterally or parenterally.

78. A cosmetic comprising said microbial oil or fraction thereof of
claim 56.

79. The cosmetic of claim 78, wherein said cosmetic is applied
topically.

80. The pharmaceutical composition of claim 58, wherein said
pharmaceutical composition is administered to a human or an animal.

81. An animal feed comprising said microbial oil or fraction thereof
of claim 56.

82. The method of claim 54 wherein said fungus is Mortierella
species.

83. The method of claim 82 wherein said fungus is Mortierella
alpina.

84. An isolated nucleotide sequence selected from the group
consisting of SEQ ID NO:13 and SEQ ID NO:15.

85. An isolated nucleotide sequence from the group consisting of
SEQ ID NO:7 and SEQ ID NO:19.

86. An isolated nucleotide sequence comprising a nucleotide
sequence selected from the group consisting of: SEQ ID NO:13; SEQ ID NO:15;
SEQ ID NO:17; SEQ ID NO:19; SEQ ID NO:21; SEQ ID NO:22; SEQ ID
NO:23; SEQ ID NO:24; SEQ ID NO:25; SEQ ID NO:26 and SEQ ID NO:27.

87. An isolated peptide sequence comprising a peptide sequence
selected from the group consisting of: SEQ ID NO:14; SEQ ID NO:16; SEQ ID
NO:18; SEQ ID NO:20; SEQ ID NO:28; SEQ ID NO:29; SEQ ID NO:30; SEQ
ID NO:31; SEQ ID NO:32; SEQ ID NO:33 and SEQ ID NO:34.

88. Purified polypeptides produced from the nucleotide sequences of
claims 84-86.

FIG. 4

10 20 30 40 50 60 

LHHTYTNIAG ADPDVSTSEP DVRRIKPNQK WFVNHINQHM FVPFLYGLLA FKVRIQDINI 
70 80 90 100 110 120 

LYFVKTNDAI RVNPISTWHT VMFWGGKAFF VWYRLIVPLQ YLPLGKVLLL FTVADMVSSY 
130 140 150 160 170 180 

WLALTFOANY WEEVQWPLP DENGIIQKDW AAMQVETTQD YAHDSHLWTS ITGSLNYQXV 
HHLFPH

FastA Match of ma29 and contig 253538a

SCORES Init1: 117 Initn: 225 Opt: 256
Smith-Waterman score: 408; 27.0% identity in 441 aa overlap

[Alignment of ma29 with contig 253538a; the residue rows are not recoverable from the scan.]

FastA Match of ma524 and contig 253538a

SCORES Init1: 231 Initn: 499 Opt: 401
Smith-Waterman score: 620; 27.3% identity in 455 aa overlap

[Alignment of ma524 with contig 253538a; the residue rows are not recoverable from the scan.]

Figure 14



INTERNATIONAL SEARCH REPORT 



International Application No

PCT/US 98/07422 



tprr'Tf2Nf5'/5T^^ C12N5/10 C12P7/64 CllBl/00 

A61K31/20 A23L1/30 A23K1/00 

According to International Patent Classification (IPC) or to both national classification and IPC



B. FIELDS SEARCHED 



Minimum documentation searched (classification system followed by classification symbols)

IPC 6 C12N C12P C11B A61K A23L A23K



Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched



Electronic data base consulted during the international search (name of data base and, where practical, search terms used)



C. DOCUMENTS CONSIDERED TO BE RELEVANT



Category * 



Citation of document, with indication, where appropriate, of the relevant passages 



Relevant to claim No.



COVELLO P. ET AL.: "Functional expression
of the extraplastidial Arabidopsis
thaliana oleate desaturase gene (FAD2) in
Saccharomyces cerevisiae"
PLANT PHYSIOLOGY,
vol. 111, no. 1, May 1996, pages 223-226,
XP002075211
see the whole document

WO 93 06712 A (RHONE POULENC AGROCHIMIE)
15 April 1993
cited in the application
see the whole document

WO 94 18337 A (MONSANTO CO ;UNIV MICHIGAN
(US); GIBSON SUSAN IRMA (US); KISHORE G)
18 August 1994
see the whole document



1-83 



1-83 



1-83 



m 



Further documents are listed in the continuation of box C.



Patent family members are listed in annex.



Special categories of cited documents:

"A" document defining the general state of the art which is not
considered to be of particular relevance

"E" earlier document but published on or after the international
filing date

"L" document which may throw doubts on priority claim(s) or
which is cited to establish the publication date of another
citation or other special reason (as specified)

"O" document referring to an oral disclosure, use, exhibition or
other means

"P" document published prior to the international filing date but
later than the priority date claimed

"T" later document published after the international filing date
or priority date and not in conflict with the application but
cited to understand the principle or theory underlying the
invention

"X" document of particular relevance; the claimed invention
cannot be considered novel or cannot be considered to
involve an inventive step when the document is taken alone

"Y" document of particular relevance; the claimed invention
cannot be considered to involve an inventive step when the
document is combined with one or more other such documents,
such combination being obvious to a person skilled
in the art

"&" document member of the same patent family



Date of the actual completion of the international search



7 September 1998 



Date of mailing of the International search report 

21/09/1998 



Name and mailing address of the ISA 

European Patent Office, P.B. 5818 Patentlaan 2
NL - 2280 HV Rijswijk
Tel. (+31-70) 340-2040. Tx. 31 651 epo nl. 
Fax: (+31-70) 340-3016 



Authorized officer 



Kania, T 



Form PCT/ISA/210 (second sheet) (July 1992)



page 1 of 3 



INTERNATIONAL SEARCH REPORT 


International Application No

PCT/US 98/07422


C.(Continuation) DOCUMENTS CONSIDERED TO BE RELEVANT


Category *


Citation of document, with indication, where appropriate, of the relevant passages




A 


WO 96 21022 A (RHONE POULENC AGROCHIMIE)
11 July 1996 

cited in the application 
* see the whole document * 




1-83 


A 


EP 0 561 569 A (LUBRIZOL CORP) 22 

September 1993 

cited in the application

see the whole document 




1-83 


A 


SPYCHALLA J. ET AL.: "Identification of
an animal w3 fatty acid desaturase by
heterologous expression in Arabidopsis"
PNAS, U.S.A.,

vol. 94, no. 4, 18 February 1997, pages
1142-1147, XP002076628
see esp. discussion




1-83 


A 


HILLIER L. ET AL.: "The WashU-Merck EST

Project, AC W49761" 

EMBL DATABASE, 

30 May 1996, XP002076629 

Heidelberg 

* corresponding to SEQ ID NO: 21 * 
see the whole document 


• 


86 


A 


NATHANS J.: "Adult human retina cDNA, AC 
W28140" 

EMBL DATABASE, 

14 May 1996, XP002076630 

Heidelberg 

* corresponding to SEQ ID NO: 22 * 
see the whole document 




86 


A 


HILLIER L. ET AL.: "The WashU-Merck EST 

Project, AC W67716" 

EMBL DATABASE, 

16 June 1996, XP002076631 

Heidelberg 

* corresponding to SEQ ID NO: 24 * 
see the whole document 




86 


A 


HILLIER L. ET AL, : "The WashU-Merck EST 

Project, AC H17219" 

EMBL DATABSE, 

1 July 1995, XP002076632 

Heidelberg 

* corresponding to SEQ ID NO: 25,27 * 
see the whole document 




86 





A    HILLIER L. ET AL.: "The WashU-Merck EST Project, AC H19385",
     EMBL DATABASE, 7 July 1995, XP002076633, Heidelberg;
     corresponding to SEQ ID NO: 26; see the whole document.
     Relevant to claim 86.

P,X  YOSHINO R. ET AL.: "AC C25549",
     EMBL DATABASE, 24 July 1997, XP002076634, Heidelberg;
     corresponding to SEQ ID NO: 13; see the whole document.
     Relevant to claims 84, 86-88.

P,X  CADENA D. ET AL.: "AC AF002668",
     EMBL DATABASE, 4 July 1997, XP002076635, Heidelberg;
     corresponding to SEQ ID NO: 21; see the whole document.
     Relevant to claims 86-88.

T    MICHAELSON L. ET AL.: "Isolation of a delta5-fatty acid
     desaturase gene from Mortierella alpina",
     JOURNAL OF BIOLOGICAL CHEMISTRY, vol. 273, no. 30,
     24 July 1998, pages 19055-19059, XP002076636;
     see the whole document.
     Relevant to claims 1-83.






Box I Observations where certain claims were found unsearchable (Continuation of item 1 of first sheet)

This International Search Report has not been established in respect of certain claims under Article 17(2)(a) for the following reasons:

1. Claims Nos.: because they relate to subject matter not required to be searched by this Authority, namely:

Remark: Although claims 57, 76, 77 [...]
are directed to a method of treatment of the human/animal
body, the search has been carried out and based on the alleged
effects of the compound/composition.

2. Claims Nos.: because they relate to parts of the International Application that do not comply with the prescribed requirements to such
an extent that no meaningful International Search can be carried out, specifically:

3. Claims Nos.: because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a).



Box II Observations where unity of Invention is lacking (Continuation of Item 2 of first sheet) 



This International Searching Authority found multiple inventions in this international application, as follows:

see additional sheet

1. [ ] As all required additional search fees were timely paid by the applicant, this International Search Report covers all
searchable claims.

2. [X] As all searchable claims could be searched without effort justifying an additional fee, this Authority did not invite payment
of any additional fee.

3. [ ] As only some of the required additional search fees were timely paid by the applicant, this International Search Report
covers only those claims for which fees were paid, specifically claims Nos.:

4. [ ] No required additional search fees were timely paid by the applicant. Consequently, this International Search Report is
restricted to the invention first mentioned in the claims; it is covered by claims Nos.:

Remark on Protest
[ ] The additional search fees were accompanied by the applicant's protest.
[ ] No protest accompanied the payment of additional search fees.






FURTHER INFORMATION CONTINUED FROM PCT/ISA/ 210 



This International Searching Authority found multiple (groups of) 
Inventions in this international application, as follows: 

1. Claims: 1-83

Nucleic acids, polypeptides, constructs comprising delta-5
desaturase according to SEQ ID NO: 1, 2 derived from the
fungus Mortierella alpina. Recombinant host cells comprising
said nucleic acids or constructs.
Methods for the production of arachidonic acid in a
microbial cell comprising a cell containing a vector
encoding an enzyme activity which converts
dihomo-γ-linolenic acid to arachidonic acid, preferentially
a delta-5 desaturase, more preferentially from Mortierella.
A recombinant yeast cell converting more than 5% of a 20:3
fatty acid to a 20:4 fatty acid.

Methods for desaturating dihomo-γ-linolenic acid using said
microbial cells, fatty acids and oils obtained thereby.
Methods for obtaining altered long chain polyunsaturated
fatty acid biosynthesis using microbes comprising delta-5
desaturase derived from fungi or algae.
Microbial oils derived therefrom and their use for
therapeutic, nutritional, and cosmetic purposes, as well
as products derived therefrom.

An isolated sequence comprising the sequence of SEQ ID NO:
13, purified polypeptides produced therefrom, esp. comprising
SEQ ID NO: 14.

3. Claims: 84, 86-88 partially

An isolated nucleotide comprising the sequence of SEQ ID NO:
15, purified polypeptides produced therefrom, esp. comprising
SEQ ID NO: 16.

4. Claims: 85 completely, 86-88 partially 

An isolated nucleotide sequence consisting of SEQ ID NO: 17 
and SEQ ID NO: 19, purified polypeptides produced therefrom, 
esp. comprising SEQ ID NO: 18 and SEQ ID NO: 20. 

Method for the expression of genes in plants.

A method for the expression of genes in plants, parts of
plants, and plant cell cultures, in which a DNA fragment is
used comprising an inducible plant promoter of root
nodule-specific genes. DNA fragments comprising an inducible
plant promoter, to be used when carrying out the method,
said DNA fragments being identical with, derived from or
comprising a 5' flanking region of root nodule-specific
genes of any origin, as well as plasmids and a transformed
Agrobacterium rhizogenes strain which can be used when
carrying out the method.

A method for the expression of genes in plants,
parts of plants, and plant cell cultures, and DNA
fragments, plasmids, and transformed microorganisms
to be used when carrying out the method, as well
as the use thereof for the expression of genes in
plants, parts of plants, and plant cell cultures.

The invention relates to a novel method for the 
expression of genes in plants, parts of plants, 
and plant cell cultures, as well as DNA fragments 
and plasmids comprising said DNA fragments to be
used when carrying out the method. The invention 
furthermore relates to transformed plants, parts 
of plants and plant cells. 

The invention relates to this method for the
expression of genes of any origin under control of
an inducible, root nodule-specific promoter.

The invention relates especially to this method 
for the expression of root nodule-specific genes
in transformed plants including both leguminous
plants and other plants.

The invention relates furthermore to DNA fragments 
comprising an inducible plant promoter to be used 
when carrying out the method, as well as plasmids 
comprising said DNA fragments. 

In the specification the following terms, among
others, are used:

Root nodule-specific genes: Plant genes active
only in the root nodules of leguminous plants, or
genes with an increased expression in root nodules.
Root nodule-specific plant genes are expressed at
predetermined stages of development and are
activated in a coordinated manner during the symbiosis,
whereby nitrogen fixation takes place and the
fixed nitrogen is utilized in the metabolism of
the plant.

Inducible plant promoter: Generally is meant a
promoter-active 5' flanking region from plant genes
inducible from a low activity to a high activity.
In relation to the present invention "inducible
plant promoter" means a promoter derived from,
contained in, or identical with a 5' flanking
region, including a leader sequence, of root
nodule-specific genes and capable of promoting and
regulating the expression of a gene as characterised
in relation to the present invention.

Leader sequence: Generally is meant a DNA sequence
that is transcribed into mRNA but not further
translated into protein. The leader sequence thus
comprises the DNA fragment from the start of
transcription to the ATG codon constituting the
start of translation. In relation to the present
invention "leader sequence" means a short DNA
fragment contained in the above inducible plant promoter,
typically comprising 40-70 bp, which may
comprise sequences that are targets for
posttranscriptional regulation.
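
The general definition of a leader sequence can be illustrated with a minimal Python sketch. It assumes the position of the transcription start (CAP site) is already known; the toy sequence, the index, and the function name `leader_sequence` are invented for illustration and are not taken from the specification. The sketch also assumes the leader ends immediately before the ATG codon.

```python
def leader_sequence(dna, cap_index):
    """Return the stretch from the transcription start (CAP site)
    up to, but not including, the first downstream ATG codon."""
    dna = dna.upper()
    atg = dna.find("ATG", cap_index)
    if atg == -1:
        raise ValueError("no ATG start codon downstream of the CAP site")
    return dna[cap_index:atg]


# Hypothetical toy sequence: 14 bp of upstream flank, a 10 bp leader,
# then the beginning of a coding sequence starting with ATG.
toy = "ttTATATAAAccgg" + "ACATTGCTTC" + "ATGGGTGCT"
cap = 14  # assumed index of the first transcribed nucleotide

leader = leader_sequence(toy, cap)
print(leader, len(leader))
```

A real leader of the kind defined above would be 40-70 bp long; the toy leader is kept short so the result can be checked by eye.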

Promoter region: A DNA fragment containing a
promoter which comprises target sequences for RNA
polymerase as well as possible activation regions
comprising target sequences for transcriptional
effector substances. In the present invention,
target sequences for transcriptional effectors may
also be situated 3' to the promoter, i.e. in the
coding sequences, the intervening sequences or on
the 3' flanking region of a root nodule-specific
gene.

Furthermore a number of molecular-biological terms 
generally known to persons skilled in the art are 
used, including the terms stated below:

CAP (addition) site: The nucleotide at the 5' end
of the transcript where 7-methylGTP is added; in
the Figures often also given as an asterisk (*)-marked
nucleotide on a given nucleotide sequence.

DNA sequence or DNA segment: A linear array of
nucleotides interconnected through phosphodiester
bonds between the 3' and 5' carbon atoms of adjacent
pentoses.

Expression: The process undergone by a structural
gene to produce a polypeptide. It is a combination
of transcription and translation as well as possible
posttranslational modifications.

Flanking regions: DNA sequences surrounding coding
regions. 5' flanking regions contain a promoter.
3' flanking regions may contain a transcriptional
terminator etc.

Gene: A DNA sequence composed of three or four
parts, viz. (1) the coding sequence for the gene
product, (2) the sequences in the promoter region
which control whether or not the gene will be
expressed, (3) those sequences in the 3' end
conditioning the transcriptional termination and
optionally polyadenylation, as well as (4) intervening
sequences, if any.

Intervening sequences: DNA sequences within a gene
which are not coding for any peptide fragment. The
intervening sequences are transcribed into pre-mRNA
and are eliminated by modification of pre-mRNA
into mRNA. They are also called introns.

Chimeric gene: A gene composed of parts from various
genes. E.g. the chimeric Lbc3-5'-3'-CAT is composed
of a chloramphenicol acetyltransferase-coding
sequence deriving from E. coli and 5' and 3' flanking
regulatory regions of the Lbc3 gene of soybean.

Cloning: The process of obtaining a population of
organisms or DNA sequences deriving from one such
organism or sequence by asexual reproduction, or
more particularly a process of isolating a particular
organism or part thereof, and the propagation of
this subfraction as a homogeneous population.

Coding sequences: DNA sequences determining the 
amino acid sequence of a polypeptide. 

Cross-inoculation group: A group of leguminous
plant species capable of producing functionally
active root nodules with Rhizobium bacteria isolated
from root nodules of other species of the group.

Leghemoglobin (Lb): An oxygen-binding protein
exclusively synthesized in root nodules. The Lb
proteins regulate the oxygen partial pressure in the
root nodule tissue and transport oxygen to the
bacteroids. In this manner the oxygen-sensitive
nitrogenase enzyme is protected. The Lb genes are
root nodule-specific genes.

Messenger-RNA (mRNA): RNA molecule produced by
transcription of a gene and possibly modification of
mRNA. The mRNA molecule mediates the genetic message
determining the amino acid sequence of a polypeptide
by part of the mRNA molecule being translated into
said peptide.

Downstream: A position in a DNA sequence. It is
defined relative to the transcriptional direction
5'-3' of the gene relative to which the position
is stated. The 3' flanking region is thus
positioned downstream of the gene.

Nucleotide: A monomeric unit of DNA or RNA
consisting of a sugar moiety (pentose), a phosphate,
and a nitrogenous heterocyclic base. The base is
linked to the sugar moiety via a glycosidic bond
(1' carbon of the pentose), and this combination
of base and sugar is a nucleoside. The base
characterises the nucleotide. The four DNA bases are
adenine (A), guanine (G), cytosine (C), and thymine
(T). The four RNA bases are A, G, C, and uracil (U).

Upstream: A position in a DNA sequence. It is
defined relative to the transcriptional direction
5'-3' of the gene relative to which the position
is stated. The 5' flanking region is thus positioned
upstream of this gene.

Plant transformation: Processes leading to
incorporation of genes in the genome of plant cells in
such a manner that these genes are reliably
inherited through mitosis and meiosis or in such a
manner that these genes are only maintained for
short periods.

Plasmid: An extra-chromosomal double-stranded DNA
sequence comprising an intact replicon such that
the plasmid is replicated in a host cell. When the
plasmid is placed within a unicellular organism,
the characteristics of that organism are changed
or transformed as a result of the DNA of the
plasmid. For instance a plasmid carrying the gene for
tetracycline resistance (TcR) transforms a cell
previously sensitive to tetracycline into one which
is resistant to it. A cell transformed by a plasmid
is called a transformant.

Polypeptide: A linear array of amino acids
interconnected by means of peptide bonds between the
α-amino and carboxy groups of adjacent amino acids.

Recombination: The creation of a new DNA molecule 
by combining DNA fragments of different origin. 

Homologous recombination: A recombination between
sequences showing a high degree of homology.

Replication: A process reproducing DNA molecules.

Replicon: A self-replicating genetic element
possessing an origin for the initiation of DNA
replication and genes specifying the functions
necessary for the control and replication thereof.

Restriction fragment: A DNA fragment resulting from
double-stranded cleavage by an enzyme recognizing 
a specific target DNA sequence. 

RNA polymerase: Enzyme effecting the transcription 
of DNA into RNA. 

Root nodule: Specialized tissue resulting from
infection of mainly roots of leguminous plants
with Rhizobium bacteria. The tissue is produced by
the host plant and therefore comprises plant cells,
whereas the Rhizobium bacteria upon infection are
surrounded by a plant cell membrane and
differentiate into bacteroids. Root nodules are produced
on other species of plants upon infection by
nitrogen-fixing bacteria not belonging to the Rhizobium
genus. Root nodule-specific plant genes are also
expressed in these nodules.

Southern hybridization: Denatured DNA is transferred,
after size separation in agarose gel, to a
nitrocellulose membrane. Transferred DNA is analysed
for a predetermined DNA sequence or a predetermined
gene by hybridization. This process allows binding
of single-stranded, radioactively marked DNA
sequences (probes) to complementary single-stranded
DNA sequences bound on the membrane. The position
of DNA fragments on the membrane binding the probe
can subsequently be detected on an X-ray film.

Symbiotic nitrogen fixation: The relationship
whereby bacteroids of root nodules convert the nitrogen
(dinitrogen) of the air into ammonium utilized by
the plant while the plant provides the bacteroids
with carbon compounds as a carbon source.

Symbiont: One part of a symbiotic relationship;
Rhizobium in particular is called the microsymbiont.

Transformation: The process whereby a cell
incorporates a DNA molecule.

Translation: The process of producing a polypeptide
from mRNA or:

the process whereby the genetic information present 
in a mRNA molecule directs the order of specific 
amino acids during the synthesis of a polypeptide. 

Transcription: The method of synthesizing a
complementary RNA sequence from a DNA sequence.
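
The two definitions above (translation and transcription) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the template strand is an invented toy sequence written 3' to 5', the function names are hypothetical, and the codon table is deliberately partial, covering only the codons used in the example.

```python
# RNA base paired with each DNA template base during transcription.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

# Deliberately partial codon table (assumption: only the codons needed
# for the toy example are listed); None marks a stop codon.
CODONS = {"AUG": "Met", "GGU": "Gly", "GCU": "Ala", "UAA": None}


def transcribe(template_3_to_5):
    """Build the complementary mRNA (read 5'->3') from a DNA template
    strand given in 3'->5' orientation."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


def translate(mrna):
    """Read codons from the first AUG until a stop codon is reached.
    Raises KeyError for codons missing from the partial table."""
    start = mrna.find("AUG")
    if start == -1:
        return []
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODONS[mrna[i:i + 3]]
        if residue is None:
            break
        peptide.append(residue)
    return peptide


mrna = transcribe("TACCCACGAATT")  # invented toy template
peptide = translate(mrna)
print(mrna, peptide)
```

The sketch leaves out everything between the two steps in a real eukaryotic gene, such as removal of intervening sequences and the untranslated leader.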

Vector: A plasmid, phage DNA or other DNA sequence
capable of replication in a host cell and having
one or a small number of endonuclease recognition
sites at which such DNA sequences may be cleaved
in a determinable manner without loss of an
essential biological function.

Traditional plant breeding is based on repeated
crossbreeding of plant lines individually carrying
desired qualities. The identification of progeny
lines carrying all the desired qualities is a
particularly time-consuming process as the biochemical
and genetic basis of the qualities is usually
unknown. New lines are therefore chosen according to
their phenotype, usually after a screening of many
lines in field experiments.

Through the ages a direct connection has existed
between the state of nutrition, i.e. the health,
of the population and the agricultural possibility
of ensuring a sufficient supply of assimilable
nitrogen in order to obtain satisfactory yields.

Already in the seventeenth century it was discovered
that plants of the family Leguminosae, including,
besides peas, also beans, lupins, soybean, bird's-foot
trefoil, vetches, alfalfa, sainfoin, and trefoil, had
an ability to improve crops grown on the habitat
of these plants. Today it is known that this
is due to the fact that members
of the family Leguminosae are able to produce
nitrogen reserves themselves. On the roots they carry
bacteria with which they live in symbiosis.

An infection of the roots of these leguminous plants
with Rhizobium bacteria causes a formation of root
nodules able to convert atmospheric nitrogen into
bound nitrogen, which is a process called nitrogen
fixation.

Atmospheric nitrogen is thereby converted into forms
which can be utilized by the host plant as well as 
by the plants later on growing on the same habitat. 

In the nineteenth century the above possibility was
utilized for the supply of nitrogen in order to
achieve a novel increase of the crop yield.

The later further increases in the yield have,
however, especially been obtained by means of natural
fertilizers and nitrogen-containing synthetic
fertilizers. The resulting pollution of the
environment makes it desirable to provide alternative
possibilities of ensuring the supply of nitrogen
necessary for the best possible yields obtainable.

It would thus be valuable to make possible an
improvement of the existing nitrogen fixation systems
in leguminous plants as well as to allow an
incorporation of nitrogen fixation systems in other
plants.

The recombinant DNA technique and the plant
transformation systems developed now render it possible
to provide plants with new qualities in a
well-controlled manner. These characteristics can derive
from not only the same plant species, but also
from all other prokaryotic or eukaryotic organisms.
The DNA techniques further allow a quick and
specific identification of progeny lines carrying the
desired qualities. In this manner a specific plant
line can be provided with one or more desired
qualities in a quick and well-defined manner.

Correspondingly, plant cells can be provided with
well-defined qualities and subsequently be
maintained as plant cell lines by means of known tissue
culture methods. Such plant cells can be utilized
for the production of chemical and biological
products of particular interest such as dyes, flavours,
aroma components, plant hormones, pharmaceutical
products, primary and secondary metabolites as
well as polypeptides (enzymes).

A range of factors and functions necessary for
biological production of a predetermined gene
product are known. Both the initiation and regulation
of transcription as well as the initiation and
regulation of posttranscriptional processes can
be characterised.

At the gene level it is known that these functions
are mainly carried out by 5' flanking regions. A
wide range of 5' flanking regions from prokaryotic
and eukaryotic genes has been sequenced, and in
view inter alia thereof a comprehensive knowledge
has been provided of the regulation of gene
expression and of the sub-regions and sequences being
of importance for the regulation of expression of
the gene. Great differences exist in the regulatory
mechanisms of prokaryotic and eukaryotic organisms,
but many common features apply to the two groups.

The regulation of the expression of a gene may take
place at the transcriptional level and is then
preferably exerted by regulating the initiation
frequency of transcription. The latter is
well-known and described inter alia by Benjamin Lewin,
Gene Expression, John Wiley & Sons, vol. I, 1974,
vol. II, Second Edition 1980, vol. III, 1977. As
an alternative the regulation may be exerted at
the posttranscriptional level, e.g. by the
regulation of the frequency of the translation
initiation, of the rate of the translation, and of
the termination of the translation.

The present invention is based on the surprising
finding that 5' flanking regions of root
nodule-specific genes, exemplified by the 5' flanking
region of the soybean leghemoglobin Lbc3 gene, can
be used for inducible expression of a foreign gene
in an alien leguminous plant. The induction and
regulation of the promoter is preferably carried
out in the form of a regulation and induction at
the transcriptional level and differs thereby
from the inducibility stated in Patent Application
No. 86114704.9, the latter inducibility preferably
being carried out at the translation level.

The transcription of both the Lbc3 gene of the
soybean and of a chimeric Lbc3 gene transferred to
bird's-foot trefoil starts at a low level
immediately upon the appearance of the root nodules on the
plant roots. Subsequently, a high increase of the
transcription takes place immediately before the
root nodules turn red. The transcription of a range
of other root nodule-specific genes is initiated
exactly at this time. The simultaneous induction
of the transcription of the Lb genes and other
root nodule-specific genes means that one or more
common DNA sequences must be present in the various genes,
controlling this pattern of expression. Thus the
leghemoglobin c3 gene is a representative of one
class of genes, and the promoter and the leader
sequence, target areas for activation as well as
the control elements of the organ specificity of
the Lbc3 gene are representatives of the control
elements of a complete gene class.

The promoter of the 5' flanking regions of the Lb
genes functions in soybeans and is responsible for
the transcription of the Lb genes in root nodules.
It is furthermore known that the efficiency of
both the transcription initiation and the subsequent
translation initiation on the leader sequence of the
Lb genes is high, as the Lb proteins constitute
approximately 20% of the total protein content in
root nodules.

The sequence of 5' flanking regions of the four
soybean leghemoglobin genes Lba, Lbc1, Lbc2, and
Lbc3 appears from the enclosed sequence scheme,
scheme 1, wherein the sequences are stated in such
a manner that the homology between the four 5'
flanking regions appears clearly.

In the sequence scheme indicates that no base 

is present in the position in question. The names
of the genes and the base position counted upstream
from the ATG start codon are indicated to the right
of the sequence scheme. Furthermore the important
sequences have been underlined.

As it appears from the sequence scheme a distinct
degree of homology exists between the four 5'
flanking regions, and in the position 23-24 bp upstream
from the CAP addition site they all contain a
TATATAAA sequence corresponding to the "TATA" box
which in eukaryotic cells is usually located a
corresponding number of bp upstream from the CAP
addition site. Furthermore a CCAAG sequence is
present 64-72 bp upstream from the CAP addition
site, said sequence corresponding to the "CCAAT"
box usually located 70-90 bp upstream from the CAP
addition site. From the CAP addition site to the
translation start codon, ATG, leader sequences of
52-59 bp are present and show a distinct degree of
homology of approx. 75-80%.
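
The kind of measurement described above (distance of a motif upstream of the CAP addition site, and percent homology between aligned sequences) can be sketched in Python. This is a hedged illustration only: the flanking sequence below is an invented toy, not one of the sequences of scheme 1, the function names are hypothetical, and homology is simplified to the share of identical columns in two pre-aligned strings of equal length, with a gap ("-") counted as a mismatch.

```python
def upstream_offsets(flank, cap_index, motif):
    """Distances in bp from the CAP site back to the first base of each
    occurrence of `motif` that starts upstream of the CAP site."""
    flank = flank.upper()
    hits = []
    start = flank.find(motif)
    while start != -1 and start < cap_index:
        hits.append(cap_index - start)
        start = flank.find(motif, start + 1)
    return hits


def percent_identity(a, b):
    """Share of identical, non-gap columns in two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(x == y and x != "-" for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Invented toy flank: CCAAG motif, spacer, TATATAAA motif, spacer, CAP site.
toy_flank = "GGCCAAG" + "C" * 40 + "TATATAAA" + "G" * 16 + "A"
cap = len(toy_flank) - 1  # assumed CAP site: the final A

tata = upstream_offsets(toy_flank, cap, "TATATAAA")
ccaag = upstream_offsets(toy_flank, cap, "CCAAG")
identity = percent_identity("ACGTACGT", "ACGTTCGT")
print(tata, ccaag, identity)
```

The offsets here are measured to the first base of each motif; other conventions (for example measuring to the last base) shift the numbers by the motif length.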

In accordance with the present invention it has
furthermore been proved, exemplified by the Lbc3
gene, that the 5' flanking regions of the soybean
leghemoglobin genes are functionally active in
other plant species. The latter has been proved by
fusing the E. coli chloramphenicol acetyl
transferase (CAT) gene with the 5' and 3' flanking
regions of the soybean Lbc3 gene in such a manner
that the expression of the CAT gene is controlled
by the Lb promoter. This fusion fragment was cloned
into the integration vectors pAR1 and pAR22,
whereby the plasmids pAR29 and pAR30 were produced.
Through homologous recombination the latter plasmids
were integrated into the Agrobacterium rhizogenes
T DNA region. The transformation of Lotus
corniculatus (bird's-foot trefoil) plants, i.e. transfer
of the T DNA region, was obtained by wound infection
on the hypocotyl. Roots developed from the
transformed plant cells were cultivated in vitro and
freed from A. rhizogenes bacteria by means of
antibiotics. Completely regenerated plants were produced
from these root cultures in a conventional manner
through somatic embryogenesis or organogenesis.

Regenerated plants were subsequently inoculated
with Rhizobium loti bacteria and root nodules for
analysis were harvested. Transcription and
translation of the chimeric Lbc3 CAT gene could
subsequently be detected in root nodules on transformed
plants as the activity of the produced
chloramphenicol acetyl transferase enzyme.

The conclusion can subsequently be made that the
promoter-containing 5' flanking regions of root
nodule-specific genes exemplified by the soybean
Lbc3 promoter are functionally active in foreign
plants. The latter is a surprising observation as
root nodules are only developed as a consequence
of a very specific interaction between the
leguminous plant and its corresponding Rhizobium
microsymbiont.

Soybeans produce nodules only upon infection by the
species Rhizobium japonicum and Lotus corniculatus
only upon infection by the species Rhizobium loti.
Soybean and Lotus corniculatus belong therefore to
two different cross-inoculation groups, each group
producing root nodules by means of two different
Rhizobium species. The expression of a chimeric
soybean gene in Lotus corniculatus proves therefore
an unexpected universal regulatory system applying
to the expression of root nodule-specific genes.
The regulatory DNA sequences involved can be placed
on the 5' and 3' flanking regions of the genes,
here exemplified by the 2.0 kb 5' and 0.9 kb 3'
flanking regions of the Lbc3 gene. This surprising
observation allows the use of root nodule-specific
promoters and regulatory sequences in any other
plant species and any other plant cell line.

In other experiments the 5' flanking region of the
nodule-specific N23 gene was fused to the CAT gene
and the Lbc3 3' flanking region in such a manner
that the expression of the CAT gene is controlled
by the N23 promoter. This fusion fragment was cloned
into the integration vector pAR22 producing the
plasmid N23-CAT which was subsequently recombined
into A. rhizogenes and transferred to Lotus
corniculatus and Trifolium repens (white clover) by the
previously described method. The root
nodule-specific expression of the transferred N23-CAT gene
obtained in L. corniculatus infected with Rhizobium
loti and in T. repens infected with Rhizobium
trifolii further demonstrated that expression of root
nodule-specific genes is independent of the plant
species and Rhizobium species. A universal
regulatory system therefore regulates the expression
of root nodule-specific genes in the different
symbiotic systems formed between legumes and the
Rhizobium species of the various cross-inoculation
groups.

It is known from European Patent Application EP
122,791 A1 that plant genes from one species can, by
Agrobacterium-mediated transformation, be
transferred into a different plant species. It is also
known from EP 122,791 A1 that a transferred gene
encoding the seed storage protein phaseolin can
be expressed in tobacco and alfalfa. From the
literature it is also known that this expression
is seed specific (Sengupta-Gopalan et al. 1985,
Proc. Natl. Acad. Sci. 82, 3320-3324).

The present invention therefore relates to a novel
method for the expression of transferred genes in
a root nodule-specific manner, using DNA regulatory
sequences from the 5' promoter region, the coding
region, or the 3' flanking region of root
nodule-specific genes, here exemplified by the
leghemoglobin Lbc3 gene and the N23 gene. This method is
distinct from both the method of Agrobacterium-mediated
transformation and expression of the seed
storage protein phaseolin gene characterised in EP
122,791 A1. Expression of the transferred phaseolin
gene in EP 122,791 A1 only demonstrates that the
phaseolin gene family with its particular regulatory
requirements can be expressed in tobacco and
alfalfa. It does not demonstrate nor predict that any
other genes with their particular regulatory
requirements can be expressed in any other plants or
plant tissue.

An object of the present invention is to provide a 
possibility of expressing desired genes in plants, 
parts of plants, and plant cell cultures. 

A further object of the invention is to render it
possible to express genes of any origin by the
control of an inducible root nodule-specific
promoter.

A particular object of the invention is to provide
a possibility of expressing desired genes in
leguminous plants.

A still further particular object of the invention
is to provide a possibility of expressing root
nodule-specific genes in non-leguminous plants.

Further objects of the invention are to improve the 
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existing nitrogen- fixing systems in leguminous 
plants as well, as to Incorporate nitrogen- fixing 
systems in other plants. 

A further ohject of the invention is to provide a 
5 possibility of in certain cases allowing the use 
of specific sequences of the 3' flanking region, 
of the coding sequence, and of intervening sequences 
to influence the regulation of the root nodule- 
specific promoter. 

10 Furthermore it is an object of the invention to 
provide plasmids comprising the above mentioned 
inducible plant promoter. 

Further objects of the invention appear immediately 
from the following description. 

The method according to the invention for the
expression of genes in plants, parts of plants, and
plant cell cultures is carried out by introducing
into a cell thereof a recombinant DNA segment
containing both the gene to be expressed and a 5'
flanking region comprising a promoter sequence, and
optionally a 3' flanking region, and culturing
the transformed cells in a growth medium, said
method being characterised by using as the
recombinant DNA segment a DNA fragment comprising an
inducible plant promoter (as defined) from root
nodule-specific genes. If desired the transformed
cells are regenerated to plants.
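
The layout of the recombinant DNA segment described above (a 5' flanking region comprising the promoter, the gene to be expressed, and an optional 3' flanking region) can be illustrated with a minimal sketch. The class name `ExpressionCassette`, its field names, and the placeholder sequences are assumptions made purely for illustration; they are not taken from the specification and do not represent any actual construct.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExpressionCassette:
    """Hypothetical container for the parts of the recombinant DNA segment."""
    five_prime_flank: str                    # 5' flanking region comprising the promoter
    gene: str                                # gene to be expressed
    three_prime_flank: Optional[str] = None  # optional 3' flanking region

    def assemble(self) -> str:
        """Join the parts in 5' -> 3' order, skipping an absent 3' flank."""
        parts = [self.five_prime_flank, self.gene, self.three_prime_flank or ""]
        seq = "".join(parts).upper()
        if set(seq) - set("ACGT"):
            raise ValueError("segment contains non-ACGT characters")
        return seq


# Placeholder sequences only; these are NOT sequences from the specification.
cassette = ExpressionCassette(five_prime_flank="ttataaa", gene="ATGGCTTAA")
segment = cassette.assemble()

with_terminator = ExpressionCassette("TTATAAA", "ATGGCTTAA", "AATAAA").assemble()
```

Keeping the 3' flanking region optional mirrors the wording of the method, in which that region is only optionally part of the segment.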

The method according to the invention allows, in a
well-defined manner, the expression of foreign genes
in plants, parts of plants, and plant cell cultures,
in this connection especially genes providing the
plants with desired properties such as, for instance,
resistance to plant diseases and an increased content
of valuable polypeptides.

A further use is the preparation of valuable
products such as, for instance, dyes, flavourings, plant
hormones, pharmaceutical products, primary and
secondary metabolites, and polypeptides by means
of the method according to the invention in plant
cell cultures and plants.

By using the method according to the invention for
the expression of root nodule-specific genes it is
possible to express root nodule-specific genes
necessary for the formation of an active
nitrogen-fixing system both in leguminous plants and
in other plants. The correct developmental control, cf.
Example 8, allows the establishment of a symbiotic
nitrogen-fixing system in non-leguminous plants. In
this manner it is surprisingly possible to improve
the existing nitrogen-fixing systems in leguminous
plants as well as to incorporate nitrogen-fixing
systems in other plants.

The use of the method according to the invention
for the expression of foreign genes in root nodules
renders it possible to provide leguminous plants
with improved properties such as resistance to
herbicides and resistance to diseases and pests.

According to a particular embodiment of the method
according to the invention a DNA fragment is used
which comprises an inducible plant promoter and
which is identical with, derived from, or comprises
5' flanking regions of leghemoglobin genes. In this
manner the expression of any gene is obtained.

Examples of such DNA fragments are DNA fragments
of the four 5' flanking regions of the soybean
leghemoglobin genes, viz.



Lba with the sequence:



GAGATACATT ATAATAATCT CTCTAGTGTC TATTTATTAT TTTATCTGGT
GATATATACC TTCTCGTATA CTGTTATTTT TTCAATCTTG TAGATTTACT
TCTTTTATTT TTATAAAAAA GACTTTATTT TTTTAAAAAA AATAAAGTGA
ATTTTGAAAA CATGCTCTTT GACAATTTTC TGTTTCCTTT TTCATCATTG
GGTTAAATCT CATAGTGCCT CTATTCAATA ATTTGGGCTC AATTTAATTA
GTAGAGTCTA CATAAAATTT ACCTTAATAG TAGAGAATAG AGAGTCTTGG
AAAGTTGGTT TTTCTCGAGG AAGAAAGGAA ATGTTAAAAA CTGTGATATT
TTTTTTTTGG ATTAATAGTT ATGTTTATAT GAAAACTGAA AATAAATAAA
CTAACCATAT TAAATTTAGA ACAACACTTC AATTATTTTT TTAATTTGAT
TAATTAAAAA ATTATTTGAT TAAATTTTTT AAAAGATCGT TGTTTCTTCT
TCATCATGCT GATTGACACC CTCCACAAGC CAAGAGAAAC ACATAAGCTT
TGGTTTTCTC ACTCTCCAAG CCCTCTATAT AAACAAATAT TGGAGTGAAG
TTGTTGCATA ACTTGCATCG AACAATTAAT AGAAATAACA GAAAATTAAA
AAAGAAATAT G,



Lbc1 with the sequence:



TTCTCTTAAT ACAATGGAGT TTTTGTTGAA CATACATACA TTTAAAAAAA
AATCTCTAGT GTCTATTTAC CCGGTGAGAA GCCTTCTCGT GTTTTACACA
CTTTAATATT ATTATATCCT CAACCCCACA AAAAAGAATA CTGTTATATC
TTTCCAAACC TGTAGATTTA TTTATTTATT TATTTATTTT TACAAAGGAG
ACTTCAGAAA AGTAATTACA TAAAGATAGT GAACATCATT TTATTTATTA
TAATAAACTT TAAAATCAAA CTTTTTTATA TTTTTTGTTA CCCTTTTCAT
TATTGGGTGA AATCTCATAG TGAAGCCATT AAATAATTTG GGCTCAAGTT
TTATTAGTAA AGTCTGCATG AAATTTAACT TAACAATAGA GAGAGTTTTC
GAAAGGGAGC GAATGTTAAA AAGTGTGATA TTATATTTTA TTTCGATTAA
TAATTATGTT TACATGAAAA CATACAAAAA AATACTTTTA AATTCAGAAT
AATACTTAAA ATATTTATTT GCTTAATTGA TTAACTGAAA ATTATTTGAT
TAGGATTTTG AAAAGATCAT TGGCTCTTCG TCATGCCGAT TGACACCCTC
CACAAGCCAA GAGAAACTTA AGTTGTAAAC TTTCTCACTC CAAGCCTTCT
ATATAAACAT GTATTGGATG TGAAGTTATT GCATAACTTG CATTGAACAA
TAGAAAATAA CAAAAAAAAG TAAAAAAGTA GAAAAGAAAT ATG,
Lbc2 with the sequence:

TCGAGTTTTT ACTGAACATA CATTTATTAA AAAAAACTCT CTAGTGTCCA
TTTATTCGGC GAGAAGCCTT CTCGTGCTTT ACACACTTTA ATATTATTAT
ATCCCCACCC CCACCAAAAA AAAAAAAACT GTTATATCTT TCCAGTACAT
TTATTTCTTA TTTTTACAAA GGAAACTTCA CGAAAGTAAT TACAAAAAAG
ATAGTGAACA TCATTTTTTT AGTTAAGATG AATTTTAAAA TCACACTTTT
TTATATTTTT TTGTTACCCT TTTCATTATT GGGTGAAATC TCATAGTGAA
ACTATTAAAT AGTTTGGGCT CAAGTTTTAT TAGTAAAGTC TGCATGAAAT
TTAACTTAAT AATAGAGAGA GTTTTGGAAA GGTAACGAAT GTTAGAAAGT
GTGATATTAT TATAGTTTTA TTTAGATTAA TAATTATGTT TACATGAAAA
TTGACAATTT ATTTTTAAAA TTCAGAGTAA TACTTAAATT ACTTATTTAC
TTTAAGATTT TGAAAAGATC ATTTGGCTCT TCATCATGCC GATTGACACC
CTCCACAAGC CAAGAGAAAC TTAAGTTGTA ATTTTTCTAA CTCCAAGCCT
TCTATATAAA CACGTATTGG ATGTGAAGTT GTTGCATAAC TTGCATTGAA
CAATAGAAAT AACAACAAAG AAAATAAGTG AAAAAAGAAA TATG,



and Lbc3 with the sequence:

TATGAAGATT AAAAAATACA CTCATATATA TGCCATAAGA ACCAACAAAA
GTACTATTTA AGAAAAGAAA AAAAAAACCT GCTACATAAT TTCCAATCTT
GTAGATTTAT TTCTTTTATT TTTATAAAGG AGAGTTAAAA AAATTACAAA
ATAAAAATAG TGAACATCGT CTAAGCATTT TTATATAAGA TGAATTTTAA
AAATATAATT TTTTTGTCTA AATCGTATGT ATCTTGTCTT AGAGCCATTT
TTGTTTAAAT TGGATAAGAT CACACTATAA AGTTCTTCCT CCGAGTTTGA
TATAAAAAAA ATTGTTTCCC TTTTGATTAT TGGATAAAAT CTCGTAGTGA
CATTATATTA AAAAAATTAG GGCTCAATTT TTATTAGTAT AGTTTGCATA
AATTTTAACT TAAAAATAGA GAAAATCTGG AAAAGGGACT GTTAAAAAGT
GTGATATTAG AAATTTGTCG GATATATTAA TATTTTATTT TATATGGAAA
CTAAAAAAAT ATATATTAAA ATTTTAAATT CAGAATAATA CTTAAATTAT
TTATTTACTG AAAATGAGTT GATTTAAGTT TTTGAAAAGA TGATTGTCTC
TTCACCATAC CAATTGATCA CCCTCCTCCA ACAAGCCAAG AGAGACATAA
GTTTTATTAG TTATTCTGAT CACTCTTCAA GCCTTCTATA TAAATAAGTA
TTGGATGTGA AGTTGTTGCA TAACTTGCAT TGAACAATTA ATAGAAATAA
CAGAAAAGTA GAAAAGAAAT ATG.
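
Because the 5' flanking sequences are printed in blocks of ten nucleotides with margin line numbers and trailing punctuation, it can help to normalise a printed listing before working with it. The following is a minimal sketch under stated assumptions: the function name `normalise_listing` is hypothetical, the sample text is the final three rows of the Lbc3 listing with a margin number added for illustration, and the motif `TATATAAA` searched for is merely an illustrative choice, not a feature identified by the specification.

```python
import re


def normalise_listing(printed: str) -> str:
    """Strip spaces, digits and punctuation from a printed sequence listing."""
    seq = re.sub(r"[^A-Za-z]", "", printed).upper()
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        # Characters such as X or ? usually indicate a transcription problem.
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return seq


# Final rows of the Lbc3 5' flanking listing; the "15" imitates a margin number.
LBC3_TAIL = """
GTTTTATTAG TTATTCTGAT CACTCTTCAA GCCTTCTATA TAAATAAGTA
15 TTGGATGTGA AGTTGTTGCA TAACTTGCAT TGAACAATTA ATAGAAATAA
CAGAAAAGTA GAAAAGAAAT ATG.
"""

seq = normalise_listing(LBC3_TAIL)
ends_at_start_codon = seq.endswith("ATG")  # the listings end at the ATG codon
motif_pos = seq.find("TATATAAA")           # illustrative motif; -1 if absent
```

Rejecting any character outside A, C, G and T is a deliberate choice here, so that a mis-transcribed base is flagged instead of silently dropped.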



A further embodiment of the method according to
the invention uses a DNA fragment identical with,
derived from or comprising 5' flanking regions of
the Lbc3-5'-3'-CAT gene with the sequence:



TATGAAGATT AAAAAATACA CTCATATATA TGCCATAAGA ACCAACAAAA
GTACTATTTA AGAAAAGAAA AAAAAAACCT GCTACATAAT TTCCAATCTT
GTAGATTTAT TTCTTTTATT TTTATAAAGG AGAGTTAAAA AAATTACAAA
ATAAAAATAG TGAACATCGT CTAAGCATTT TTATATAAGA TGAATTTTAA
AAATATAATT TTTTTGTCTA AATCGTATGT ATCTTGTCTT AGAGCCATTT
TTGTTTAAAT TGGATAAGAT CACACTATAA AGTTCTTCCT CCGAGTTTGA
TATAAAAAAA ATTGTTTCCC TTTTGATTAT TGGATAAAAT CTCGTAGTGA
CATTATATTA AAAAAATTAG GGCTCAATTT TTATTAGTAT AGTTTGCATA
AATTTTAACT TAAAAATAGA GAAAATCTGG AAAAGGGACT GTTAAAAAGT
GTGATATTAG AAATTTGTCG GATATATTAA TATTTTATTT TATATGGAAA
CTAAAAAAAT ATATATTAAA ATTTTAAATT CAGAATAATA CTTAAATTAT
TTATTTACTG AAAATGAGTT GATTTAAGTT TTTGAAAAGA TGATTGTCTC
TTCACCATAC CAATTGATCA CCCTCCTCCA ACAAGCCAAG AGAGACATAA
GTTTTATTAG TTATTCTGAT CACTCTTCAA GCCTTCTATA TAAATAAGTA
TTGGATGTGA AGTTGTTGCA TAACTTGCAT TGAACAATTA ATAGAAATAA
CAGAAAAGTA GAATTCTAAA ATG

A still further preferred embodiment of the method
according to the invention uses a DNA fragment
identical with, derived from or comprising 5'
flanking regions of the N23 gene with the sequence



lO 20 30 40 so 60 70 

GIUWgTC GllGCTCGCCCGGGGngCGarOCTCT&GR CTCGAC CTGaUSCCCJ^ 
wpw ypr SalX 

80 90 100 lip 

25 TSCIATTG&GACJU:GATTTGAACAAZTTTTACAXTA<r 




220 230 240 250 2€0 270 280 

ATGAAXGCrATGATATTGATGGTCTTGATliTATTimCAGJUaTGI^^ 

290 30O 3X0 320 330 340 350 

AGAAGTTAGCACACCAATAGAAGXATTGAGTTATATTAAAACTTSAC^TT^^ 

360 370 380 390 400 410 420 

CAXATASAATTTZATTGACAASCCTTASAACAGTTGCXACXGTTGRAAGAOGSTCXTCAAM 

430 440 450 4€0 470 480 490 

20 ACTTAAATCATATCTAAAATCAACAATOTTACAAGAlAGRXTGAAXGAfiTTAGTTATTTTAT^ 

500 510 520 530 540 550 560 

AGTAAAGrGTXAGAATTGTTTGKZTATAAAACTCX6AXAAAr6ArTTTGCAGXTAAAAAAACTA6AA(^ 

570 580 590 60 O. .610 . 620 g30 

TAATATAAAAATTGAXAXTTTATATAATASATTAAGTCTCTTTAAAAYTCTTGTAAAAAA^^ 

640 650 660 670 680 690 700 

AAASAATA&AATAAACCAACTCITAATTTTAATGAAACAZCCC^^ 

710 720 730 740 750 760 770 

AAAAATTAATGCTTGATGGAAGTTTTTAArrTGTTCTACTCAi^ACTCAAAGGGTTCT 



780 790 800 8l0 820 830 840 

25 TATCaTTTATATGtTGTAAAIIAT(aU»!GCACXAGTAATTA03mAAT<SATAA^ 
850 860 870 880 890 900 910

ATTTC»raTCTCTTOGCAA£TCGTGAGAATT<»UlTiVrATTA^ 

920 930 940 950 960 970 960 

AGJUlTAiUlTATTTAXATACAATTCCTAGATTTTGTTATJU^TTCACATATTGTATGAGTATAAik^ 

990 XOOO lOlO 1020 1O30 1040 lOSO 

GAGCACACACCAAACTAGTCTCAAATTAAGTAAGGTGCTAATTAITAGCGGCTAGCTAAG^ 

OdeZ 

ATTAATG 

In a particularly preferred embodiment of the method
according to the invention a 3' flanking region of
root nodule-specific genes is furthermore used, in
particular sequences of the 3' flanking region
capable of influencing the activity or regulation of
a promoter of the root nodule-specific genes or
the transcription termination, or capable of
influencing the yield of the desired gene product in
another manner.

Examples of such 3' flanking regions are the four
3' flanking regions of the soybean leghemoglobin
genes, viz.

Lba with the sequence: 



1590 1620 
TAA TTA GTA TCT ATT GCA CTA AAG TGT AAT AAA TAA ATC TTG 

1650 

TTT CAC TAT AAA ACT TGT TAC TAT TAG ACA AGG GCC TGA TAG AAA ATC TTG GTT AAA ATA 

1710 1740 
20 ATG GAA TTA TAT ACT ATT GGA TAA AAA TCT TAA GOT TAA TAT TCT ATA TTT GCG TAG GTT 

1770 

TAT GCT TGT GAA TCA TTA TCG GTA TTT TTT TTC CTT TCT GAT AAT TAA TCG GTA AAT TA 

1830 IBBO 
ACA AAT AAG TTC AAA ATG ATT TAT ATC TTT CAA AAT TAT TTT AAC AGC AGG TAA AAT CXT 



ATT TGG TAC GAA AGC TAA TTC GTC GA 
Lbc1 with the sequence:



1320 

TAA/TT AGG ATC TAG TGC ATT GCC GTA 



1350 1380 
AAG T6T AAT AAA ZAA ATO TT6 TTT CAA CTA AAA CTT GTT ATT AAA CAA 6TT CCC TAT ATA 

1410 1440 
AAT GTT GTT TAA AAT AA6 TAA ATT TCA TTG TAT TGG ATA AAC ACT TTT AAG TTA TAT ATT 

1470 ISOO 
5 TCC ATA TAT TTA CGT TTG TGA ATC ATA ATC GAT ACT TTA TAA AAA TAA ATT CCA AAT AAT 

TTA TAC GTT TTA AAA ATT ATT TT 



Lbc2 with the sequence:



TAG/GAT CTA CTA TTG CC6 TCA AGT 

Z140 

GTA ATA AAT AAA TTT TGT TTC ACT AAA ACT TGT TAT TAA ACA AGT CCC CGA TAT AXA AAT 

1170 1200 

20 <^ 6GT TAA AAT AAG TAA ATT ATA C6G TAT TGA TAA ACA ATC TTA AGT TTT ATA TAT AGT 

1230 1260 

TCC AXA TAC TAA AGT TTG TGA ATC ATA ATC GA 

1290 



and Lbc3 with the sequence: 



TAG/GAT CTA CAA TTG CCT TAA AGT GTA AXA AAT AAA 
990 1020 

TAT TAT TTC ACT AAA ACT TGT TAT TAA ACC AAG TTC TCG ATA TAA AXG TTG GTT AAA CTA. 

1O50 lOBQ 

IC AGT AAA TTA TAT GGT ATT GGA TAA ACA ATC TTA AGC TT 

lllO 



This sequence is positioned on the 0.9 kb 3'
flanking region used according to the invention. A
particular embodiment of the invention is therefore
the use of sequences of this region exerting or
mediating the regulation characterised by the
invention of root nodule-specific promoter regions.

In a preferred embodiment of the method according
to the invention a region is used of the coding
sequence or intervening sequence of root
nodule-specific genes, in particular sequences of the
coding sequence or the intervening sequence capable
of influencing the regulation of a promoter of the
root nodule-specific genes or capable of
influencing the yield of the desired gene product in
another manner.

Examples of such coding sequences and intervening
sequences are the four leghemoglobin genes of
soybean, viz.



Lba with the sequence: 120

ATG/GTT 

ISO 

GUI ASP AXA LEO VAL SER SER SBR PHB GLU ALA PHE LYS ALA AS» 
2S^GCTTOGGTCAGTAGCTCATTCGAA<kATO 

210 240 





ALA 


PHB 


THR GLU 




GCT 


TTC 


ACT 


GAG 




ZLE 


PRO 


GLN 


TYR 




ATT 


CCT 


CAA 


TAG 


20 


CCA 


TTC 


TAT 


GTT 




GAG 


TGG 


TTT 


TGG 




LEU 


PHE 


SER 


PHE 




TTG 


TTC 


TCA 


TTT 




GLU 


LYS 


LEU 


PHE 




GAA 


AAG 


CTT 


TTT 




TTA 


ATT 


TTA 


AGA 


25 


TGT 


TTG 


AAT 


TGT 




TAT 


TAG 


TAT 


TTG 



270 



330 



390 



450 



510 



570 



6 30 



300 

r GTT TGA AAA AAG ATA TAT TGT TAA TGT 

360 

ILE LEU GLU LYS ALA PRO ALA ALA LYS ASP 
ATA CTG GAG AAA GCA CCT GCA GCA AAG GAC 

420 

0 THR ASK PRO LYS LEU THR GLY HIS ALA 
C ACT AAT. CCT AAC CTC ACG GGC CAT GCT 

460 

fiC TAA AAT TAT AAC TAT TTT ATG TGA 

540 

C.ACT CTT AAA ACA TCA ATG AAC ATT AAT 

600 

C TTG AAC TAG GAA TAG TAT ATA AAT TTC 

660 
690 720
VAL AR6 ASF SER AIA GLY G£M ISO LYS ALA SER GLY TRR VAL VAL ALA 
TTT TGA ATT GTAG/GTG CGT GAC TCA GCT GGT CAA CTT AAA GCA AGT GGA ACA GTG GT6 GCT 

750 780 
ASP ALA ALA LEU GLY SER VAL BIS ALA GLN LYS ALA VAL THR ASP PRO GXJSt PHE VAL 
GAT GCC GCA CTT GGT TOT GIT CAT GCC CAA AAA GCA GTC ACT GAT CCT CAG TTC GTG/GT 

810 840 
ATG ATA AAT AAT GAA AT6 TTA TAA TAA ATT ATG CAT ACT TCA ATT TTT CAT GGA GCA GTA 

870 90O 
TAA TGA TCA ACA CAC ACT TCT TTT GTT TCA T6C ATT TGA TAA CTA CAA TCT TAA AAT GTT 

930 9SO 
GCA ATC TTA AAA ATA GTA TTA AAA ASA TAA CAT TTA ATT AGC TCA TCA ATA TTT TTC TGT 

990 1020 
TGC AAT TTT TTA TGA AAA AAT TAT AAT TAT GAA TTC TTT GAG CAA TGT TTA ATT AAA AAA 

1050 1080 
TTC ATT TAA TAA TGA AAT AAC TAA GCT ACC TCT GTC TCG TTT TTC ATT TAA ACT ATG ACA 

mo 1140 
TAA ACA ATG AAT AAA GTA AAC TAA ACC ATG ACA TGT TTA TTT TTG AAT GAG CTT ATT AAT 

1170 X200 
AAT TTT TTT TCA CTA TCT ATT GCA ATG TTC ATT GAT TAT CAA TTA TCT TGG TTG CAT TGA 

1230 

TTC TCT CGA TTT TTT TCT TGA GGT TAA GCT TCA GTT CAA TAT ATA TTC. ATT TTT TGA TAA 

1290 1320 
AAA AAA ATA GTA CAA TAT AIT TTC ATT TAG CT6 ATC ATA TTT ATT TAA GTT CAA CTT AAA 

1350 1380 
ATT TTA TAG ATG TTA ATT GAT ATA ATT TGT TGA GAT GAT GAG AAG ACC AAT ACC ATT ACG 

1410 1440 
TAC TCT TTT GAA AGT GTT ATA TGG ATT TTA ATT ATA AGG.AAA AAT GTA AGA GCT AAA CCA 

14 70 1500 
VAL VAL LYS GLU ALA LEO LEO LYS THR ILE LYS ALA ALA VAL 
TTG CTG ATG. ATT TTG AAG/GTG GTT AAA GAA GCA CTG CTG AAA ACA ATA AAG GCA GCA GTT 

1530 1560 
GLY ASP LYS TRP SER ASP GLO LEU SER ARG ALA TRP GLO VAL ALA TYR ASP GLO LEO ALA 
GGG GAC AAA TGG AGT GAC GAG TTG AGC CGT GCT TGG GAA GTA GCC TAC GAT GAA TTG GCA 



ALA ALA ILE LYS LYS ALA 
GCA GCT ATT AAG AAG GCA TAA 



The amino acid sequence of the Lba protein is
indicated above the coding sequence.



Lbc1 with the sequence:
180
GLY

ATG/GCT 

210 240 
ALA FHE THR GLD LYS GLtl GLU ALA LEU VAL SER SER SER PHE GLU ALA PRE LYS ALA ASN 
GCT TTC ACT GAG AAG CAA GaG GCT TTG GTG AGT AGC TCA TTC GAA GCA TTC AAG GCA AAC 

270 300 
XLE PRO GLN 7VR SER VAL VAL PHE TYR ASN SER 

ATT CCT CAA TAG AGC GTT GTG TTC TAC AAT TC/CTAA GTT TTC TCT ATA AGC ATG TGT CTT 

330 360 
TCA TTC -TAT GTT TTT CTT -CTG GAA ATT TTT TGT GTT TGA AAA AAG ATA TAT ATA TAT ATA 

390 420 
5 TAT ATA TAT ATA TAT ATA TAT ATA TAT ATA TAT ATA TAT TTT GTT AAT GTG ACT GGT TTT 

450 480 
XLE LEU GLU LYS ALA PRO ALA ALA LYS ASP LEU PRE SER 
GGT TTG ATT AAA AAT AAA TA6/6ATT CTG GAG AAA GCA CCT GCA GCA AAG GAC TTG TTC TCA 

SIO 540 
PHE LEU ALA ASN GLY VAL ASP PRO THR ASN PRO LYS LEU THR GLY HIS ALA GLO LYS LEU 
TTT CTA GCA AAT GGA GTA GAC CCC ACT AAT CCT AAG CTC ACG GGC CAT GCT GAA AAG CTT 

570 «00 

PHE ALA LEU 

TTT GCA TT6/GT AAG TAT CAG CCA ACT AAA ATT ATA ACT ATT TTA TGT GAT TAA TTT TAA 

630 €fiO 
CAT TAA ACA TCA TGT ATT TTA ACA CTC TTA AAA TAT CAA TGA ACA TTA ATT TTT TGA ATT 

690 f20 
lO GTA TTT TAT ATT TTT ACC ATA TCT TGA ACT AG6 AAT AAT ATA TAA ATT TCT ATT ACT ATT 

750 7BO 
TGT TGG TAA TTA CAT ATA TAT ATA TAT ATA TAA TCC TTG TGA TAA TTA TTT TTC GAA TTT 

810 840 
VAL ARG ASP SER ALA GLY GLN LEU LYS THR ASN GLY THR VAL VAL ALA ASP ALA ALA 
GTAG/GTG CGT GAC TCA GCT GGT CAA CTT AAA ACA AAT GGA ACA GTG CTG GCT GAT GCT GCA 

870 900 

LEU VAL SER ILE HIS ALA GLN LYS ALA VAL THR ASP PRO GLN PRE VAL 

CTT GTT TCT ATC CAT GCC CAA AAA GCA GTC ACT GAT CCT CAG TTC GTG/GT ATG ATA AAT 

930 966 
AAT ACT AST AAA ATG TTA CAA TAA ATG CAA ACT TAA GTT TTA CGT ACA TAG TGA TCA TGA 

990 1020 
15 CTT CAT GCA TGG CTA TTA TTT TTT CAT ATT TAT TGA AGT CAA CTT AAA ATT TTG TAA ATA 

1050 1080 
CAG ATC GAT GCT AGT AAT TTG TTG AGA TCA TGA GAA AAC GTA CCA CTA CTC CAA TAG CAT 

1110 1"0 
TAC TCA TTT TGA AAA TTG TAT AAC TGT GAT CTA ATT ATA A6G AAA AAG TGT ATA TAA G«3 

1170 120O 
VAL VAL LYS. GLU ALA LEU LEU LYS THR 
CTA ATC CAT TAT TAA TGT TTT TTA TAT TTT GTAG/GTG GTT AAA GAA GCA CTG CTG AAA ACA 

1230 1260 
ILE LYS GLU ALA VAL GLY GLY ASN TRP SER ASP GLU LEU SER SER ALA TRP GLU VAL ALA 
ATA AAG GAA GCT GTT GGC GGC AAT TGG AGT GAC GAA TTG AGC AGT GCT TGG GAA GTA GCC 

1290 

TYR ASP GLU LEO ALA ALA ALA ZLE LYS LYS ALA 
20 TAT GAT GAA TTG GCA GCA GCA ATT AAA AAG GCA TAA 



The amino acid sequence of the Lbc1 protein is
indicated above the coding sequence.
Lbc2 with the sequence:

G/GGT 
180 

AIA FBB THR 6LU LYS 6LN GLU ALA LEU VAL SER SER SER PHE GLU AIA PHE LY5 AUl ASN 
GCT TTC ACT GAG AAG CAA GAG 6CT TTG GZG AGT AGC TCA TTC GAA GCA TTC AA6 6CA AAC 

210 240 

ILE PRO GUi TYR SER VAL VAL PHE TYR THR SER 

AlT CCT CAA TAC AGC GTT 6TG TTC TAC ACT TC/GTA AGT TTT CTC TTA AAG CAT GTA TCT 

270 300 

5 TTC ATT CTC TGT TTT TCC TTT CGA CAT TTT TTG TGT TTG AAA A6A GAT AGT GTC AAT GT6 

330 360 

ILE LEU GLU LY8 ALA FRO ALA ALA LYS 
AGT GGG TAT TTT TTT TTA TTA AAA ATT AAC AG/G ATA CTG GAG AAA GCA CCC GCA GCA AAG 

390 420 

ASP LEU PHE SER PHE X£U SER ASN 6LY VAL ASP PRO SER ASN PRO LYS ££0 THR GLY HIS 
GAG TTG TTC TC6 TTT CXA TCT AAT GGA GTA GAT OCT AGT AAT OCX AAG CTC ACG G6C CAT 

450 480 

ALA GLU LYS LEU PHE GLY LEU 

GCT GAA AAG CTT TTT GGA TTG/GT& AGT ATC ATC CAA CTA AAA TTA TAG CTA TTT TAT GT6 

510 540 

lO ATT AAT TTT AAG ATT AAA CAT GTA TTT AAC ACT CTT AAA CAT GTA TTT AAC ACT CTT AAG 

570 600 

ATT AAA CAT GTA TTT AAC TAA AAC ATG TAT TTG CTG ATT ATT TTT TTT TTA TAA TTA TCT 

630 660 

VAL ARC ASP SER ALA GLY GLN LEU LYS ALA 
TGT CAC ATA TTA TAT ATT TTT T6A ATT GTA G/GTG CGT GAC TCA GCT GGT CAA CTT AAA GCA 

690 720 

ASN GLY THR VAL VAL ALA ASP ALA ALA LEU GLY SER ZLE fiZS ALA GUV LYS ALA Z£S THR 
AAT GGA ACA GTA GTG GCT GAT 6CC GCA CTT GGT TCT ATC CAT GCC CAA AAA GCA ATC ACT 

750 780 

1 c ASP PRO GLN PHE VAL 

GAT CCT CAG TTC GXG/GT ATG ATA AAT AAT AAA ATG TTA CAA TAA ATG CAC ATA TAC TTA 

810 840 

AAT TTT ACA TGG TGC AGT GTT ATG ATC ATC ATT TTT GTT TAG TAA TGA ATT TAC TTA AAA 

870 900 

TCT TAA ATT ATG TAC TTT TTG AAA GTT TTA TAT GGA ATT TTA ATT ATA GGG AAA AAT GTA 

930 960 

^ VAL VAL LYS GLU ALA LEU LEU LYS THR 

AGA CCT AAT CCA TTA GTG ATG TTT TGT CTG TAG/GTG GTT AAA GAA GCA CTG CTG AAA ACA 

990 1020 

I£ LYS GLU ALA VAL GLY ASP LYS TRP SER ASP GLU LEU SER SER ALA TRP GLU VAL ALA 
ATA AAG GAG GCA GTT GGG GAC AAA TGG AGT GAT GAA TTG AGC AGT GCT TGG GAA GTA GCC 

1050 X080 



20^R ASP GLU LEU ALA ALA ALA ZLE LYS LYS ALA PHE 

TAT GAT GAA TTG GCA GCA GCT ATT AAG AAG GCA TTT TAC 

1110 
The amino acid sequence of the Lbc2 protein is
indicated above the coding sequence. 



and Lbc3 with the sequence: 



GLY AIA PBS THR ASP 
G/6GT GOT TTC ACT GAT 
120 



LYS GLN GLU ALA LEU VAL SER SER SER PHE GLU ALA PHE LYS THR ASN ILE PRO 6LN TYR 

- TTT m»« «mm t*mm iPK 

150 



« AA6 CAA GAG GCT TT6 GTG AGT AGC TCA TTT GAA GCA TTC AA6 ACA AAC ATT CCT CAA TAC 
» ISO loO 



ACT CTT GTG TTC TAC ™C TC/GTA ACT ATT CTA TCT AAA TTA TGT GTC TTA TTG TAT GTT 

210 240 

TAA CTT TCG TGG TTT GTT GTG TTT GAA AAA AA6 ATA, TAT ATT GTT AAT GTG AGT GGT TTT 

270 

ZL£ LEU GLU LYS ALA PRO VAL ALA LYS ASP LEU PHE SER 
GGT TTG ACT AAA AAT GAA TAG/G ATA CTG GAG AAA GCA CCT GTA GCA AA6 GAC TTG TTC TCA 

330 

10 PHE IfiU ALA ASN GLY VAL ASP PRO THR ASH PRO LYS ™; 5I± 

^ TTT CTA GCT AAT GGA GTA GAC CCC ACT AAT CCT AAG CTC ACG GGC CAT GCT GAA AAA^CTT 

390 ^20 

PHE GLY LEU 

TTT GGA TT6/GT AAG TAT CCA GCC TAC TAA AAT TAA AAT CCT ATT AGT ATT TTT TAT TAT 

450 

VAL ARG ASP SER 

TTT TCT TCC ATG ATT GTC TTG TCA CAT ATT ATA TAT TTT TTG AAT TAT AG/GTA CGT GAT TCA 

510 540 

ALA GLY GLN LEU LYS ALA SER GLY THR VAL VAL ILE ASP ALA AIA «^ JJjE HIS ■ 

GCT GGT CAA CTT AAA GCA AGT GGA ACA GTG GTG ATT GAT GCC GCA CTT GGT TCT ATC CAT 

570 600 

Tc; ALA GLN LYS ALA ILE THR ASF PRO GLN PHE VAL 

■*^GCCCAAAAAGCAATCACTGATCCTCAATTT GTG/G TAT GAT AAA TAA T6A AAA GCT ACA 

630 . 

ATA AAT GCA CAA ATA CTT AAT TTT ACA TAG TGC AGT GCT ATA TGA TCA TCA CPT TTG CTT 

690 

AGT AAT GAA TTT ACT TTT TTT TTT TAC AGA AGT AAT GGA TTT ACT TAA AAT CTT AAA TTA 

750 

TGT ACT TCT TTA AAG AGT TTT GTA TGG AAT TTT AAT TAT AGG AAA AAT GTA AGA GCT AAA 

BIO 

VAL VAL LYS GLU ALA LEU LEU LYS THR ILE LYS GLU ALA 
CCA TTG CTG ATG ATT TCG AAG/GTG GTT AAA GAA GCA CTG CTG AAA ACA ATA AAG GAG GCA 

870 

2n VAL GLY ASP LYS TRP SER ASP GLU LEU SER SER ALA TRP GLU VAL ALA TYR ASP GLU IJBU 
GTT GGG GAC AAA TGG AGT GAC GAG TTG AGC AGT CCT TGG GAA GTA GCC TAT GAT GAA TTG 

930 , 



ALA ALA ALA ILE LYS LYS ALA PHE 
CCA GCA GCT ATT AAG AAG GCA TTT TAG 
The amino acid sequence of the Lbc3 protein is
indicated above the coding sequence. 
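
The relationship between a coding sequence and the amino acid sequence printed above it can be checked with a short translation sketch. This assumes the standard genetic code and the three-letter abbreviations used in the listings; the function name `translate` and the table construction are illustrative and not part of the specification. The sample codons are those of the exon beginning GTG GTT AAA in the Lbc3 listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter codes, in the upper-case style of the listings.
THREE_LETTER = {
    "A": "ALA", "R": "ARG", "N": "ASN", "D": "ASP", "C": "CYS",
    "Q": "GLN", "E": "GLU", "G": "GLY", "H": "HIS", "I": "ILE",
    "L": "LEU", "K": "LYS", "M": "MET", "F": "PHE", "P": "PRO",
    "S": "SER", "T": "THR", "W": "TRP", "Y": "TYR", "V": "VAL",
    "*": "STOP",
}


def translate(codons: str) -> list:
    """Translate whitespace-separated codons into three-letter residues."""
    return [THREE_LETTER[CODON_TABLE[c.upper()]] for c in codons.split()]


# Codons from the Lbc3 listing (exon beginning GTG GTT AAA ...).
residues = translate("GTG GTT AAA GAA GCA CTG CTG AAA ACA ATA AAG GAG GCA")
```

For this input the result reads VAL VAL LYS GLU ALA LEU LEU LYS THR ILE LYS GLU ALA, which agrees with the residues printed above those codons in the listing.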

The present invention furthermore deals with a
novel DNA fragment comprising an inducible plant
promoter to be used when carrying out the method
according to the invention, said DNA fragment being
characterised by being identical with, derived
from or comprising a 5' flanking region of root
nodule-specific genes. Examples of such DNA
fragments are DNA fragments being identical with,
derived from or comprising a 5' flanking region of
plant leghemoglobin genes. Preferred examples are
according to the invention DNA fragments being
identical with, derived from or comprising a 5'
flanking region of the four soybean leghemoglobin
genes, viz.:

Lba with the sequence: 



GAGATACATT ATAATAATCT CTCTAGTGTC TATTTATTAT TTTATCTGGT
GATATATACC TTCTCGTATA CTGTTATTTT TTCAATCTTG TAGATTTACT
TCTTTTATTT TTATAAAAAA GACTTTATTT TTTTAAAAAA AATAAAGTGA
ATTTTGAAAA CATGCTCTTT GACAATTTTC TGTTTCCTTT TTCATCATTG
GGTTAAATCT CATAGTGCCT CTATTCAATA ATTTGGGCTC AATTTAATTA
GTAGAGTCTA CATAAAATTT ACCTTAATAG TAGAGAATAG AGAGTCTTGG
AAAGTTGGTT TTTCTCGAGG AAGAAAGGAA ATGTTAAAAA CTGTGATATT
TTTTTTTTGG ATTAATAGTT ATGTTTATAT GAAAACTGAA AATAAATAAA
CTAACCATAT TAAATTTAGA ACAACACTTC AATTATTTTT TTAATTTGAT
TAATTAAAAA ATTATTTGAT TAAATTTTTT AAAAGATCGT TGTTTCTTCT
TCATCATGCT GATTGACACC CTCCACAAGC CAAGAGAAAC ACATAAGCTT
TGGTTTTCTC ACTCTCCAAG CCCTCTATAT AAACAAATAT TGGAGTGAAG
TTGTTGCATA ACTTGCATCG AACAATTAAT AGAAATAACA GAAAATTAAA
AAAGAAATAT G,
Lbc1 with the sequence:



TTCTCTTAAT ACAATGGAGT TTTTGTTGAA CATACATACA TTTAAAAAAA 
AATCTCTAGT GTCTATTTAC CCGGTGAGAA GCCTTCTCGT GTTTTACACA 
CTTTAATATT ATTATATCCT CAACCCCACA AAAAAGAATA CTGTTATATC 
TTTCCAAACC TGTAGATTTA TTTATTTATT TATTTATTTT TACAAAGGAG 
ACTTCAGAAA AGTAATTACA TAAAGATAGT GAACATCATT TTATTTATTA
TAATAAACTT TAAAATCAAA CTTTTTTATA TTTTTTGTTA CCCTTTTCAT 
TATTGGGTGA AATCTCATAG TGAAGCCATT AAATAATTTG GGCTCAAGTT 
TTATTAGTAA AGTCTGCATG AAATTTAACT TAACAATAGA GAGAGTTTTC 
GAAAGGGAGC GAATGTTAAA AAGTGTGATA TTATATTTTA TTTCGATTAA 
TAATTATGTT TACATGAAAA CATACAAAAA AATACTTTTA AATTCAGAAT 
AATACTTAAA ATATTTATTT GCTTAATTGA TTAACTGAAA ATTATTTGAT 
TAGGATTTTG AAAAGATCAT TGGCTCTTCG TCATGCCGAT TGACACCCTC 
CACAAGCCAA GAGAAACTTA AGTTGTAAAC TTTCTCACTC CAAGCCTTCT 
ATATAAACAT GTATTGGATG TGAAGTTATT GCATAACTTG CATTGAACAA
TAGAAAATAA CAAAAAAAAG TAAAAAAGTA GAAAAGAAAT ATG, 



Lbc2 with the sequence: 



TCGAGTTTTT ACTGAACATA CATTTATTAA AAAAAACTCT CTAGTGTCCA
TTTATTCGGC GAGAAGCCTT CTCGTGCTTT ACACACTTTA ATATTATTAT
ATCCCCACCC CCACCAAAAA AAAAAAAACT GTTATATCTT TCCAGTACAT
TTATTTCTTA TTTTTACAAA GGAAACTTCA CGAAAGTAAT TACAAAAAAG
ATAGTGAACA TCATTTTTTT AGTTAAGATG AATTTTAAAA TCACACTTTT
TTATATTTTT TTGTTACCCT TTTCATTATT GGGTGAAATC TCATAGTGAA
ACTATTAAAT AGTTTGGGCT CAAGTTTTAT TAGTAAAGTC TGCATGAAAT
TTAACTTAAT AATAGAGAGA GTTTTGGAAA GGTAACGAAT GTTAGAAAGT
GTGATATTAT TATAGTTTTA TTTAGATTAA TAATTATGTT TACATGAAAA
TTGACAATTT ATTTTTAAAA TTCAGAGTAA TACTTAAATT ACTTATTTAC
TTTAAGATTT TGAAAAGATC ATTTGGCTCT TCATCATGCC GATTGACACC
CTCCACAAGC CAAGAGAAAC TTAAGTTGTA ATTTTTCTAA CTCCAAGCCT
TCTATATAAA CACGTATTGG ATGTGAAGTT GTTGCATAAC TTGCATTGAA
CAATAGAAAT AACAACAAAG AAAATAAGTG AAAAAAGAAA TATG,



and Lbc3 with the sequence: 
TATGAAGATT AAAAAATACA CTCATATATA TGCCATAAGA ACCAACAAAA
GTACTATTTA AGAAAAGAAA AAAAAAACCT GCTACATAAT TTCCAATCTT
GTAGATTTAT TTCTTTTATT TTTATAAAGG AGAGTTAAAA AAATTACAAA
ATAAAAATAG TGAACATCGT CTAAGCATTT TTATATAAGA TGAATTTTAA
AAATATAATT TTTTTGTCTA AATCGTATGT ATCTTGTCTT AGAGCCATTT
TTGTTTAAAT TGGATAAGAT CACACTATAA AGTTCTTCCT CCGAGTTTGA
TATAAAAAAA ATTGTTTCCC TTTTGATTAT TGGATAAAAT CTCGTAGTGA
CATTATATTA AAAAAATTAG GGCTCAATTT TTATTAGTAT AGTTTGCATA
AATTTTAACT TAAAAATAGA GAAAATCTGG AAAAGGGACT GTTAAAAAGT
GTGATATTAG AAATTTGTCG GATATATTAA TATTTTATTT TATATGGAAA
CTAAAAAAAT ATATATTAAA ATTTTAAATT CAGAATAATA CTTAAATTAT
TTATTTACTG AAAATGAGTT GATTTAAGTT TTTGAAAAGA TGATTGTCTC
TTCACCATAC CAATTGATCA CCCTCCTCCA ACAAGCCAAG AGAGACATAA
GTTTTATTAG TTATTCTGAT CACTCTTCAA GCCTTCTATA TAAATAAGTA
TTGGATGTGA AGTTGTTGCA TAACTTGCAT TGAACAATTA ATAGAAATAA
CAGAAAAGTA GAAAAGAAAT ATG.



Another example of a preferred DNA fragment according
to the invention is a DNA fragment which is
identical with, derived from or comprises 5'
flanking regions of the Lbc3-5'-3'-CAT gene with the
sequence



TATGAAGATT AAAAAATACA CTCATATATA TGCCATAAGA ACCAACAAAA
GTACTATTTA AGAAAAGAAA AAAAAAACCT GCTACATAAT TTCCAATCTT
GTAGATTTAT TTCTTTTATT TTTATAAAGG AGAGTTAAAA AAATTACAAA
ATAAAAATAG TGAACATCGT CTAAGCATTT TTATATAAGA TGAATTTTAA
AAATATAATT TTTTTGTCTA AATCGTATGT ATCTTGTCTT AGAGCCATTT
TTGTTTAAAT TGGATAAGAT CACACTATAA AGTTCTTCCT CCGAGTTTGA
TATAAAAAAA ATTGTTTCCC TTTTGATTAT TGGATAAAAT CTCGTAGTGA
CATTATATTA AAAAAATTAG GGCTCAATTT TTATTAGTAT AGTTTGCATA
AATTTTAACT TAAAAATAGA GAAAATCTGG AAAAGGGACT GTTAAAAAGT
GTGATATTAG AAATTTGTCG GATATATTAA TATTTTATTT TATATGGAAA
CTAAAAAAAT ATATATTAAA ATTTTAAATT CAGAATAATA CTTAAATTAT
TTATTTACTG AAAATGAGTT GATTTAAGTT TTTGAAAAGA TGATTGTCTC
TTCACCATAC CAATTGATCA CCCTCCTCCA ACAAGCCAAG AGAGACATAA
GTTTTATTAG TTATTCTGAT CACTCTTCAA GCCTTCTATA TAAATAAGTA
TTGGATGTGA AGTTGTTGCA TAACTTGCAT TGAACAATTA ATAGAAATAA
CAGAAAAGTA GAATTCTAAA ATG



Still another example of such a DNA fragment
according to the invention is a DNA fragment which
is identical with, derived from or comprises 5'
flanking regions of the N23 gene with the sequence



10 20 30 40 50 60 70 

GAATTCGAGCTCGCCCTCGGATCGATCCTCTAGAGTCGACCTGCIU^CCCAAGCTTGGATCAATCAATO 

Bcxad sail 

80 90 loo no - 120 130 140 

5 TTCTATTGAGACACGATTTGIWlUIAIlTTTTTACATTATGAGIiCTATTTT^^ 

AAATTOAAA6CICTAGATGATGAT6AATTGAAMMAATATTGT 

220 230 240 250 260 270 280 

ATGAATGCTATGATATTGATGGTCTTGATNTATTNHCAGAATTGAAAOTATTAAGAGAAGTGTC 

290 30O 310 320 330 340 ^ ^^^^350 

10 AGAAGTTAGCACACCAATACaUiOPATTGAfiTTATATTAAAACTTTAGATTCT^^ 

360 370 380 390 4O0 410 420 

CATATAGAATTTTATTiakCAATCCTTATAACAOTTGCTACTGrTGAAAGAOGTTCTTCAAAATTAAAATT 

430 440 450 460 470 460 490 

ACTTAAATCATATCTAAMTCAACAATGTTACAAGATAGATTGAATGAfiTTAGTTATTTTATCT 

500 510 S20 530 540 550 S60 

15 AGrAAAGTGrTAGAATTQTTTGATTATAAAACTCTGATAAATGATTTTCCAGTTAAAAAAACTAGAA^ 

570 560 590 600 610 620 630 

TAATATAAAAATTGATATTTTATATAATATATTAAGTCTCTTTAAAATTOTTGTAAAAAAAGACAil^^ 

640 6SO 660 670 680 690 700 

AAATAATAAAATAAAGCAACTCTTAATTTTAATGAAACATCCCTTTGTTAAACCGAATCTTCCATAATGT 

710 720 730 740 750 760 770 

2Q AAAAATTAATGCTTGATGGAA G T T TT T AATTT GTTC T A CTCAATACTCAAAGCGTyGTAAAgATTTTTTT 

7BO 790 eOO 810 620 830 840 

TATCATTTATATCTTGTAAATATGAATGCACTAGTAATTAGTTTAATGATAAAATATAiPTCTAC^GATAT 

850 860 870 880 690 900 910 

TTTTTTTT 

920 930 940 950 960 970 . 980 

2 5 AGAATAAATATTTAT ATACAATTCCTAGATTTTGTTATAAAATTCACATATTGTATGACTATAAATACAT 

990 1000 lOlO 1020 103O 1040 1050 

GAGCACACACCAAACTAGTCTCAAATTAA6TAAGGTGCTAATTATTAGCGGCTAGCTAAGTAM 

Ddel 

ATTAATG 
The invention relates furthermore to any plasmid
to be used when carrying out the method according
to the invention and characterised by comprising a
DNA fragment containing an inducible plant promoter
as herein defined. Particular examples of suitable
plasmids according to the invention are pAR11,
pAR29, pAR30, and N23-CAT, cf. Examples 3, 4, and
11. These plasmids allow recombination into the
A. rhizogenes T-DNA region.

The invention relates furthermore to any
Agrobacterium strain to be used in connection with the
invention and characterised by comprising a DNA
fragment comprising an inducible plant promoter of
root nodule-specific genes built into the T-DNA
region and therefore capable of transferring the
inducible promoter into plants. Particular examples
of bacterium strains according to the invention are
the A. rhizogenes strains AR1127 carrying pAR29,
AR1134 carrying pAR30, AR1000 carrying pAR11, and
AR204-N23-CAT carrying N23-CAT.

It is obvious that the patent protection of the
present invention is not limited by the embodiments
stated above.

Thus the invention does not exclusively employ 5'
flanking regions of soybean leghemoglobin genes. It is
well known that the leghemoglobin genes of all
leguminous plants have the same function, cf. Appleby
(1974) in The Biology of Nitrogen Fixation,
Quispel, A., Ed., North-Holland Publishing Company,
Amsterdam, Oxford, pages 499-554, and concerning the
kidney bean PvLbl gene it has furthermore been
proved that a high degree of homology exists with
the sequences of the soybean Lbc3 gene. It is also
known that the expression of other root
nodule-specific genes is regulated in a manner similar
to that of the leghemoglobin genes. The invention thus
includes the use of 5' flanking regions of leghemoglobin
genes or other root nodule-specific genes of all
plants, provided that the use of such DNA fragments
subjects the expression of a desired gene product to
the regulation characterised by the present
invention.
The present invention allows also the use of such 
fragments of any origin which under natural con- 
ditions exert or mediate the regulation charac- 
ISterised by the present invention. The latter applies 
especially to such fragments which can be Isolated 
from DNA fragments from gene libraries or genomes 
through hybridization with labelled sequences of 5' 
flanking regions of soybean leghemoglobin genes. 

It is well-known that it is possible to alter nucleotide
sequences of non-important sub-regions of 5' flanking
regions without causing an alteration of the promoter
activity and the regulation. It is also well-known that
an alteration of sequences of important sub-regions of
5' flanking regions renders it possible to alter the
binding affinities between nucleotide sequences and the
factors or effector substances necessary or responsible
for the transcription initiation and the translation
initiation and consequently to improve the promoter
activity and/or the regulation. The present invention
includes, of course, also the use of DNA fragments
containing such altered sequences of 5' flanking regions,
and in particular DNA fragments can be mentioned which
have been produced by recombining sequences of 5' flanking
regions of any gene with 5' flanking regions of root
nodule-specific genes, provided the use of such DNA
fragments subjects the expression of a desired gene
product to the regulation characterised by the present
invention.

It should be noted that the transformation of
microorganisms is carried out in a manner known per se,
cf. e.g. Maniatis et al., (1982), Molecular Cloning,
A Laboratory Manual, Cold Spring Harbor Laboratory.

The transformation of plant cells, i.e. introduction
of plasmid DNA into plant cells, is also carried
out in a manner known per se, cf. Zambryski et
al., (1983), EMBO J. 2, 2143-2150.

Cleavage with restriction endonucleases and digestion
with other DNA modifying enzymes are well-known
techniques and are carried out as recommended
by the suppliers.

The Agrobacterium rhizogenes 15834 rifR was used
as a typical representative of A. rhizogenes: see
White et al., J. Bact., Vol. 141 (1980), 1134-1141.

Example 1

Sequence determination of 5' flanking regions of
soybean leghemoglobin genes

From a soybean gene library the four soybean
leghemoglobin genes Lba, Lbc1, Lbc2, and Lbc3 are
provided as described by Jensen, E.Ø. et al., Nature
Vol. 291, No. 3817, 677-679 (1981). The genetically
stable in-bred invariable soybean species "Glycine
max. var. Evans" was used as a starting material for
the isolation of the DNA used for the construction
of said gene library. The 5' flanking regions of
the four soybean leghemoglobin genes are isolated,
as described by Jensen, E.Ø., Ph D Thesis, Institut
for Molekylær Biologi, Århus Universitet (1985),
and the DNA sequences determined by the use of the
dideoxy method as described by Sanger, F., J. Mol.
Biol. 143, 161-178 (1980) and indicated in the
sequence scheme.

Example 2

Construction of Lbc3-5'-3'-CAT

The construction has been carried out in a sequence
of process steps as described below:

a) Sub-cloning the Lbc3 gene

The Lbc3 gene was isolated on a 12 Kb EcoRI restriction
fragment from a soybean DNA library, which
has been described by Wiborg et al., in Nucl. Acids
Res. (1982) 10, 3487. A section of the fragment is
shown at the top of the attached Scheme 2. This
fragment was digested by the enzymes stated and
then ligated to pBR322 as indicated in the Scheme.
The resulting plasmids Lbc3HH and Lbc3HX were
subsequently digested by PvuII and religated, which
resulted in two plasmids called pLpHH and pLpHX.

b) Sub-cloning 5' flanking sequences from the Lbc3
gene

For this purpose pLpHH was used as shown in the
attached Scheme 3. This plasmid was opened by means
of PvuII and treated with exonuclease Bal31. The
reaction was stopped at various times and the
shortened plasmids were ligated into fragments from
pBR322. These fragments had been treated in advance
as shown in Scheme 3, in such a manner that in one

2o end they had a DNA sequence ^-^^ 

AAG ^ 

After the ligation a digestion with EcoRI took
place, and the fragments containing 5' flanking
sequences were ligated into EcoRI digested pBR322.
These plasmids were transformed into E. coli K803,
and the plasmids in the transformants were tested
by sequence analysis. A plasmid, p213-5'Lb, isolated
from one of the transformants, contained a 5' flanking
sequence terminating 7 bp before the Lb ATG
start codon in such a manner that the sequence is
as follows:

            2 Kb
5' flanking Lbc3 sequence - AAAGTAGAATTC

E. coli K803 is a typical representative of the E.
coli K12 recipient strains.



c) Sub-cloning 3' flanking region of the Lbc3
gene

For this purpose pLpHX was used which was digested
by XhoII. The ends were partially filled out and
excess single-stranded DNA was removed with S1
nuclease, as shown in the attached Scheme 4. The
fragment shown was ligated into pBR322 which had
been pretreated as shown in the Scheme. The
construction was transformed into E. coli K803. One
of the transformants contained a plasmid called
Xho2a-3'Lb. As the XhoII recognition sequence is
positioned immediately after the Lb stop codon, cf.
Scheme 2, the plasmid contained about 900 bp of
the 3' flanking region, and the sequence started
with GAATTCTACAA.
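
The two junction sequences quoted above (AAAGTAGAATTC at the end of
the 5' flanking region and GAATTCTACAA at the start of the 3' flanking
region) each carry an EcoRI recognition sequence, GAATTC, which is
what the later EcoRI-based assembly steps rely on. The following is
only an illustrative Python sketch for locating that site in the
quoted strings; the function and variable names are hypothetical and
not part of the described procedure.

```python
# Junction sequences as quoted in the text (written 5' -> 3').
FIVE_PRIME_JUNCTION = "AAAGTAGAATTC"   # end of the 5' flanking region in p213-5'Lb
THREE_PRIME_JUNCTION = "GAATTCTACAA"   # start of the 3' flanking region in Xho2a-3'Lb

ECORI_SITE = "GAATTC"  # EcoRI recognition sequence


def find_sites(sequence, site):
    """Return the 0-based start positions of every occurrence of `site`."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


five_prime_hits = find_sites(FIVE_PRIME_JUNCTION, ECORI_SITE)
three_prime_hits = find_sites(THREE_PRIME_JUNCTION, ECORI_SITE)

print("EcoRI site(s) in 5' junction:", five_prime_hits)
print("EcoRI site(s) in 3' junction:", three_prime_hits)
```

In this sketch the site sits at the very end of the 5' junction and
at the very start of the 3' junction, consistent with the two regions
being joined to other fragments through EcoRI ends.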

The construction of Lb promoter cassette

An EcoRI/SphI fragment from Xho2a-3'Lb was mixed
with a BamHI/EcoRI fragment from p213-5'Lb. These
two fragments were ligated via the BamHI/SphI cleavage
sites into a pBR322 derivative where the EcoRI
recognition sequence had been removed, cf. Scheme
4. The ligated plasmids were transformed into
E. coli K803. A plasmid in one of the transformants
contained the correct fragments, and it was called
pEJLb 5'-3'-1.

Construction of the Lbc3-5'-3'-CAT gene

The CAT gene of pBR322 was isolated on several
smaller restriction fragments, as shown in the
attached Scheme 5. The 5' coding region was isolated
as an AluI fragment which was subsequently ligated
into pBR322, treated as stated in the Scheme. This
was transformed into E. coli K803. Several
transformants contained the correct plasmid. One was
taken out and called Alu11. The 3' coding region
was isolated on a TaqI fragment. This fragment was
treated with exonuclease Bal31, whereafter EcoRI
linkers were added. Then followed a digestion with
EcoRI and a ligation to EcoRI digested pBR322. The
latter was transformed into E. coli K803 and the
transformants were analysed. A plasmid, Taq 12,
contained the 3' coding region of the CAT gene
plus 23 bp of 3' flanking sequences, terminating
in the following sequence: CCCCGAATTC.
Subsequently the following fragments were ligated
together to EcoRI digested pEJLb 5'-3'-1: the
EcoRI/PvuII fragment from Alu11, the PvuII/DdeI
fragment from pBR322 and the DdeI/EcoRI fragment
from Taq 12. This ligation mixture was
transformed into E. coli K803. Several transformants
contained the correct plasmid. One was taken out
and was called pEJLb 5'-3' CAT 15.



Example 3

a.

Cloning and integration of the soybean Lbc3-5'-3'-CAT
gene.

Two EcoRI fragments (No. 36 and No. 40) of the TL-DNA
region of the A. rhizogenes 15834 pRi plasmid were
used as "integration sites". Thus the Lbc3-5'-3'-CAT
gene was subcloned (as a 3.6 Kb BamHI/SalI fragment)
into two vectors, pAR1 and pAR22, carrying the
above EcoRI fragments. The resulting plasmids pAR29
and pAR30 were separately mobilized into A. rhizogenes
15834 rifR using a plasmid helper system;
see E. van Haute et al. (1983), EMBO J. 3, 411-417.
Neither pAR29 nor pAR30 can replicate in
Agrobacterium. Therefore the selection by means of
rifampicin 100 µg/ml and the plasmid markers
spectinomycin 100 µg/ml, streptomycin 100 µg/ml or
kanamycin 300 µg/ml will select A. rhizogenes
bacteria having integrated the plasmids via
homologous recombination through the EcoRI fragments
36 or 40. The structure of the resulting TL-DNA
regions - transferred to the transformed plant
lines L5-9 and L6-23 - has been indicated at the
bottom of the attached Scheme 6. This Scheme
furthermore shows, for the L6-23 line, the EcoRI and
HindIII fragments carrying the Lbc3-5'-3'-CAT gene
and therefore hybridizing to radioactively labelled
Lbc3-5'-3'-CAT DNA used as a probe, cf. Example
4a.

b.

Cloning and integration of the soybean Lbc3 gene.

The EcoRI fragment No. 40 has here been used as
"integration site". The Lbc3 gene was therefore
sub-cloned (as a 3.6 Kb BamHI fragment) into the
pAR1 vector and transferred into the TL-DNA region
as stated in a. The structure of the TL-DNA region,
transferred to the transformed plant line L8-35,
has been shown at the bottom of the attached Scheme
7. This Scheme furthermore shows the EcoRI and
HindIII fragments carrying the Lbc3 gene and
therefore hybridizing with radioactively labelled Lbc3
DNA used as a probe, cf. Example 4b.

Example 4

a.

Demonstration of the soybean Lbc3-5'-3'-CAT gene in
transformed plants of bird's-foot trefoil.

[Figure: Southern blot autoradiogram; fragment sizes are
marked in Kb.]

0249676 




DNA extracted from transformed lines (L6-23) or
untransformed control plants and cleaved by the
restriction enzymes EcoRI and HindIII was analyzed
by Southern hybridization. Radioactively labelled
Lbc3-5'-3'-CAT gene was used as a probe for
demonstrating corresponding sequences in the transformed
lines. The bands marked with numbers correspond to
restriction fragments constituting parts of the
Lbc3-5'-3'-CAT gene as stated in the restriction
map (Scheme 6) of Example 3a.

b.

Demonstration of the soybean Lbc3 gene in
transformed plants of bird's-foot trefoil.

DNA extracted from transformed lines (L8-35) or
untransformed control plants and cleaved by the
restriction enzymes EcoRI and HindIII was analyzed
by Southern hybridization. Radioactive Lbc3 gene
was used as a probe for detecting corresponding
sequences in the transformed lines. The bands marked
with numbers correspond to restriction fragments
constituting parts of the Lbc3 gene as stated in
the restriction map (Scheme 7) of Example 3b.

Example 5

Expression of the Lbc3-5'-3'-CAT gene in various
tissues of bird's-foot trefoil.

a.

[Figure: thin-layer chromatogram. Columns 1-3: untransformed;
columns 4-6: L6-23; columns 7-9: L5-9; columns 10-12: Lbc3
transformed. Each group is ordered R, N, LS. The spots
3AcCm, 1AcCm and Cm are marked.]

The activity of the chloramphenicol acetyl transferase
(CAT) enzyme is measured as the amount of
acetylated chloramphenicol (AcCm) produced from
14C-chloramphenicol. In (a) the acetylated forms
1AcCm and 3AcCm appear, which have been separated
from Cm through thin-layer chromatography in
chloroform/methanol (95:5). Columns 1-3 show that no
CAT activity occurs in root (R), nodule (N), or
leaves + stem (LS) of untransformed plants
of bird's-foot trefoil. Columns 4-6 and 7-9
show the CAT activity in corresponding tissues of
Lbc3-5'-3'-CAT transformed L6-23 and L5-9 plants.
The conversion of chloramphenicol in columns 5
and 8 shows the organ-specific expression of the
Lbc3-5'-3'-CAT gene in root nodules. Columns
10-12 show the lack of CAT activity in plants
transformed with the Lbc3 gene.

b.

Table

                 CAT activity (cpm/µg protein · h)

                 L6-23        L5-9
Root             0            0
Nodule           68830        154,000
Leaves + stem    0            0


In the Table (b) the CAT activity in Lbc3-5'-3'-CAT
transformed L5-9 and L6-23 plants has been stated
as the amount of 14C-chloramphenicol converted
into acetylated derivatives. The amount of
radioactivity in the acetylated derivatives has been
counted by liquid scintillation and stated in cpm/µg
protein · hour.

Example 6

Transcription test (Northern analysis) on tissues
of Lbc3-5'-3'-CAT transformed and Lbc3 transformed
Lotus plant lines.

[Figure: Northern blots; columns 1-10 cover the L5-9, L6-23
and Lbc3 transformed lines; the 28S position and the Lb
transcript are marked.]

5 µg of total RNA extracted from root (R), nodule
(N) or leaves + stem (LS) and separated in
formaldehyde agarose gels were transferred onto
nitrocellulose. Column 1 contains 5 µg of total RNA from
20-day-old soybean nodules as control plants.
Columns 2-4 and 5-7 contain total RNA from root,
nodule or leaves + stem, respectively, of the
Lbc3-5'-3'-CAT transformed lines L5-9 and L6-23.
Columns 8-10 contain RNA from corresponding tissues
of bird's-foot trefoil transformed by means of A.
rhizogenes carrying the Lbc3 gene in the TL-DNA.
In (a) radioactive DNA of the CAT coding sequence
has been used as a probe for hybridization. The
organ-specific transcription of the Lbc3-5'-3'-CAT
gene in root nodules from the L5-9 and L6-23
lines appears from columns 3 and 6. In (b) the
transcript for the constitutive ubiquitin gene(s)
is visualized using a cDNA probe for the human
ubiquitin gene for the hybridization. In (c) the
nodule-specific transcription of bird's-foot trefoil's
own leghemoglobin genes is shown. A cDNA probe of
the Lba gene of soybean has been used for this
hybridization.

Example 7

Determination of the transcription initiation site
(CAP site) of the Lbc3 promoter of soybean in
transformed root nodules of bird's-foot trefoil.

The position of the "CAP site" was determined at
the nucleotide level by means of primer extension.
A synthetic oligonucleotide 5' CAACGGTGGTATATCCAGTG 3'
complementary to the nucleotides 15-34 in the coding
sequence of the CAT gene was used as primer for
the enzyme reverse transcriptase. As a result
single-stranded cDNA was formed, the length of which
corresponds to the distance between the 5' end of
the primer and the 5' end of the primed mRNA. An 83
nucleotide cDNA strand would be expected according
to the knowledge of the transcription initiation
site of the soybean Lbc3 gene. Columns 2, 3, and 4
from left to right show the DNA strands produced
when the primer extension was carried out on
polyA+-purified mRNA from transformed root nodules
of bird's-foot trefoil, transformed leaves + stem
of bird's-foot trefoil, and untransformed root
nodules of bird's-foot trefoil, respectively. The
85, 86, 87, 88, and 90 nucleotides long cDNA strands
shown in column 2 proved correct Lbc3 promoter
function in bird's-foot trefoil. The CAP sites
corresponding to the cDNA sequences generated are
indicated with asterisks (*) on the partial
sequence of the Lbc3 5'-3'-CAT region given. In the
sequence the TATA box of the Lbc3 promoter and the
corresponding translation initiation codon of the
CAT coding sequence are underlined.
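
The primer-extension arithmetic described above can be sketched in a
few lines of Python. This is only an illustration under stated
assumptions: the primer sequence and the cDNA lengths are taken from
the text, whereas the helper names are hypothetical, and the
calculation assumes that the transcript is colinear with the
construct between the CAP site and nucleotide 34 of the CAT coding
sequence.

```python
PRIMER = "CAACGGTGGTATATCCAGTG"   # 5' -> 3', as quoted in the text
PRIMER_5PRIME_POSITION = 34       # the primer's 5' end pairs with coding nucleotide 34

_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' -> 3'."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def upstream_length(cdna_length, primer_position=PRIMER_5PRIME_POSITION):
    """Nucleotides of transcript lying upstream of coding nucleotide 1.

    The extension product runs from the primer's 5' end (coding
    nucleotide 34) to the 5' end of the mRNA, so everything beyond
    the first 34 nucleotides must lie upstream of the start codon.
    """
    return cdna_length - primer_position


# The stretch of the CAT coding strand that the primer anneals to.
target = reverse_complement(PRIMER)

observed_cdna_lengths = [85, 86, 87, 88, 90]   # column 2 in the text
upstream = [upstream_length(n) for n in observed_cdna_lengths]

print("Primer target (5'->3'):", target)
print("Upstream lengths:", upstream)
```

Under these assumptions the expected 83 nucleotide product would
correspond to 49 transcribed nucleotides upstream of the start codon,
and the observed products to 51-56 nucleotides.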



Example 8

Demonstration of the correct developmental control
of the Lbc3-5'-3'-CAT gene in transformed plants of
bird's-foot trefoil

[Figure: pieces of root with nodules at five successive
developmental stages, left to right.]

                              Developmental stage (left to right)
CAT activity
in cpm/µg protein · hour      0     0     32.6    342.3    1255 *
Nitrogenase activity
in nmol ethylene/µg protein
· hour                        0     0     0       0.5      2.7

* Substrate limited reaction; actual activity about
68000 cpm/µg protein · hour.

Chloramphenicol acetyl transferase and nitrogenase
activity were measured on cut-off pieces of root
with nodules at the different developmental stages
indicated. The CAT activity can be detected in the
white distinct nodules whereas the nitrogenase
activity did not appear until the small pink nodules
had developed. The latter development corresponds
to the development known from soybean control plants
and described by Marcker et al., EMBO J. 1984, 3,
1691-95. The CAT activity was determined as in
Example 5. The nitrogenase activity was measured
as acetylene reduction capacity of the nodules
followed by gas chromatographic determination of
ethylene.
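
The ordering of the two activities can be read directly from the
tabulated values. The short Python sketch below, with hypothetical
helper names and the five stages simply numbered 0 to 4 from left to
right, finds the first stage at which each activity is above zero.

```python
# Values from the table of Example 8, stages ordered left to right.
CAT_ACTIVITY = [0, 0, 32.6, 342.3, 1255]      # cpm/µg protein · hour
NITROGENASE_ACTIVITY = [0, 0, 0, 0.5, 2.7]    # nmol ethylene/µg protein · hour


def first_detectable(values):
    """Return the index of the first stage with activity above zero, or None."""
    for index, value in enumerate(values):
        if value > 0:
            return index
    return None


cat_onset = first_detectable(CAT_ACTIVITY)
nitrogenase_onset = first_detectable(NITROGENASE_ACTIVITY)

print("CAT first detected at stage", cat_onset)
print("Nitrogenase first detected at stage", nitrogenase_onset)
```

In this sketch CAT activity appears one stage before nitrogenase
activity, matching the statement that CAT is detectable in the white
distinct nodules while nitrogenase appears only in the small pink
nodules.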

Example 9

Demonstration of Lbc3 protein in bird's-foot trefoil
plants transformed with the soybean Lbc3 gene.

[Figure: isoelectric focusing gel, columns 1-10.]




Proteins extracted from root nodules of Lbc3
transformed (L8-35), Lbc3-5'-3'-CAT transformed and
nontransformed plants were separated by isoelectric
focusing at a pH gradient of 4 to 5. Columns
1, 3, 5, 7, and 9 show Lbc1, Lbc2, Lbc3, and Lba
proteins synthesized in soybean control root
nodules. Column 2 shows proteins from root nodules of
Lbc3-5'-3'-CAT transformed L6-23 bird's-foot trefoil
plants, whereas columns 6 and 8 show proteins
from nontransformed plants. Columns 4 and 10
show soybean Lbc3 protein synthesized in root
nodules of bird's-foot trefoil plants (L8-35)
transformed with the Lbc3 gene. The Lbc3 protein band
is indicated by an arrow.

Example 10

Expression of the Lbc3-5'-3'-CAT gene requires the
5' Lbc3 promoter region.

The Lbc3-5'-3'-CAT gene construction carries a 2 Kb
5' Lbc3 promoter region. Stepwise removal of
sequences from the 5' end of this region demonstrated
that this promoter region is required for the
characteristic expression of the Lbc3-5'-3'-CAT gene.



[Diagram: the Lbc3-5'-3'-CAT construct, showing the 2 Kb
5' Lbc3 region, the 3' Lbc3 region, and the SalI, XbaI and
SalI sites.]



The Lbc3-5'-3'-CAT gene construction was opened in
the unique XbaI site shown above, and digested with
the exonuclease Bal31. A SalI linker fragment was
ligated onto the blunt ends generated and the
shortened SalI fragments carrying the Lbc3-5'-3'-CAT gene
were transferred into L. corniculatus. The effect
of removing promoter sequences was measured as CAT
activity. End points of the deleted 5' region are
given as the distance from the CAP site in
nucleotides.

                      CAT activity (cpm/µg protein/hr)

5' end point          Root       Nodule      Leaf
(5' Lbc3 - CAT - 3' Lbc3)

-2000                 0          80000       0
-950                  0          10000       0
-474                  0          3000        0
-230                  0          3000        0
-78                   0          0           0



The drastically reduced level of CAT activity
expressed from the Lbc3 promoter deleted to nucleotide
-230 and the zero activity from the promoter deleted
to nucleotide -78 demonstrate that the Lbc3
promoter region is required for the root
nodule-specific expression of the Lbc3-5'-3'-CAT gene.
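
To make the size of the reduction explicit, the nodule values of the
deletion series can be expressed relative to the full-length
construct. The Python sketch below is illustrative only: the
activities are those tabulated in this example, the full-length
construct is labelled -2000 here as an assumption, and the function
name is hypothetical.

```python
# Nodule CAT activity (cpm/µg protein/hr) by 5' deletion end point.
# The 2 Kb full-length promoter is labelled -2000 (an assumption).
NODULE_CAT_ACTIVITY = {
    -2000: 80000,
    -950: 10000,
    -474: 3000,
    -230: 3000,
    -78: 0,
}


def relative_activity(activities, reference_end_point=-2000):
    """Express each activity as a percentage of the reference construct."""
    reference = activities[reference_end_point]
    return {
        end_point: 100.0 * value / reference
        for end_point, value in activities.items()
    }


relative = relative_activity(NODULE_CAT_ACTIVITY)

for end_point in sorted(relative):
    print(f"{end_point:>6}: {relative[end_point]:6.2f} % of full length")
```

On these figures the deletion to -950 already retains only about an
eighth of the full-length activity, and the deletions to -474 and
-230 retain a few percent.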



Example 11

Construction of the N23-CAT gene.

The N23 gene was isolated from a soybean DNA library
as described in the enclosed paper of Sandal, Bojsen
and Marcker. The N23-CAT gene was constructed from
the modified Lbc3-5'-3'-CAT gene carried on plasmid
pEJ5'-3'-CAT101 as described in the Applicant's
copending application No. 86 11 4704.9 concerning
"Expression of Genes in Yeast", and a 1 Kb
EcoRI/DdeI fragment containing the N23 5' promoter region.
The position of the EcoRI and DdeI sites in the
N23 promoter region is indicated on the DNA sequence
shown below. The cloning procedure used is outlined
below. The disclosure of the papers of Sandal et
al., the EP application, and the paper of Jensen
et al., Nature 321 (12 June 1986), 669-674, including
the references cited, should be considered
incorporated into the present description as a means
to amend, illustrate, and clarify it.

The N23-CAT gene was transferred to plants by the
same method as the Lbc3-5'-3'-CAT gene.

DNA sequence of the 5'-promoter region from the
N23 gene



1ft 20 30 40 SO 60 70 

J." aTOCTCTAGA2ESSi5CTG(yUXCCAAGCTTGGATCAATCAATTAA 



lO 



SeOX 

220 230 240 2SO 260 270 280 
ATGAATGCTJmaTATTGATGGTCTTGATMTATTimC^^ 

300 310 320 330 340 350 

360 370 380 390 400 410 420 

CAa!ATJW»UiTTTT»!rTGACA^ 

430 440 450 460 470 480 490 

500 510 520 530 540 550 560 
15 AGTAAA6TGTTAGJUWTTGTTTGAIT»rAAAAClH^^ 

570 580 590 600 610 620 MO 

640 650 660 670 680 690 700 

AAATAATAAAATAAAGCAACTCCTAATTTTAATGAAACMCCCTTTOTO^ 

710 720 730 740 7 SO 760 770 

20 juuuuwttmSgctcgatg^ 

780 790 800 810 820 830 840 

TATCATTTATATGTlteTAAATATGAATGCACTAGTAMTAGTO 

8 BO 890 900 910 



920 930 940 950 960 970 980 

25 AGAATAAATATTTATATACAATTCCTAGATTTTCTTATAAJAT^ 

990 lOOO lOlO 1020 1030 1040 lOSO 

GAGCJU»CAOCIUUtfri»Ua?CrcaAKPTAAGTAJ^^ 



AXTAATG 

Example 12

Organ-specific expression of the soybean N23-CAT
gene in root nodules of L. corniculatus and Trifolium
repens.

The activity of chloramphenicol acetyl transferase
(CAT) was measured as in Example 5 and is given in
cpm/µg protein/hr.

Table a.

                 CAT activity

                 N23-CAT transformed    Untransformed
                 L. corniculatus        L. corniculatus
Root nodule      86150                  0
Root             0                      0

Table b.

                 CAT activity

                 N23-CAT transformed    Untransformed
                 T. repens              T. repens
Root nodule      148000                 0
Root             0                      0


Tables (a) and (b) show the organ-specific expression
of the N23-CAT gene in root nodules of L. corniculatus
and T. repens. L. corniculatus was inoculated
with Rhizobium loti, while T. repens was inoculated
with Rhizobium trifolii.



In connection with the invention it has thus been
proved that root nodule-specific genes can be
expressed organ-specifically upon transfer to other
plants, here Lotus corniculatus and Trifolium
repens. It has furthermore been proved that the 5'
flanking regions comprising the promoter are
controlled by the organ-specific regulatory mechanism,
as the organ-specific control of the Lbc3-5'-3'-CAT
gene in Lotus corniculatus took place at the
transcription level. The Lbc3-5'-3'-CAT gene transferred
was thus only transcribed in root nodules of
transformed plants and not in other organs such as roots,
stems, and leaves.

The expression of the Lbc3-5'-3'-CAT gene in root
nodules of transformed plants also followed the
developmental timing known from soybean root
nodules. No CAT activity could be detected in roots
or small white root nodules (Example 8). A low
activity was present in the further developed white
distinct nodules, whereas a high activity could be
measured in the small pink nodules and mature
nodules developed later on.

The organ-specific expression and the correct
developmental expression of transferred root
nodule-specific genes, here exemplified by the
Lbc3-5'-3'-CAT gene, allow as a particular use a functional
expression of root nodule-specific genes also in
other plants beyond leguminous plants. When all
the root nodule-specific plant genes necessary for
the formation of root nodules are transferred from
a leguminous plant to a non-root-nodule-forming
plant species, the correct organ-specific expression
proved above allows production of functionally
active, nitrogen-fixing root nodules on this plant
upon infection by Rhizobium. In this manner these
plants can grow without the supply of external
inorganic or organic nitrogen compounds. Root
nodule-specific promoters, here exemplified by the
Lbc3 and N23 promoters, must be used in the present
case for regulating the expression of the
transferred genes.

According to the present invention a root
nodule-specific promoter is used for expressing genes.
The gene product or function of the gene product
improves the function of the root nodule, e.g. by
altering the oxygen transport, the metabolism, the
nitrogen fixation or the nitrogen absorption.

Root nodules are thus used for the synthesis of
biological products which improve the plant per se or
which can be extracted from the plant later on. A
root nodule-specific promoter can be used for
expressing a gene. The gene product or compound formed
by said gene product constitutes the desired
product(s).

In connection with the present invention it has
furthermore been proved that the soybean Lbc3
leghemoglobin protein per se, i.e. the Lbc3 gene
product, is present in a high concentration in root
nodules of bird's-foot trefoil plants expressing
the Lbc3 coding sequence under the control of the
Lbc3 promoter. The latter has been proved by cloning
the genomic Lbc3 gene of the soybean into the
integration vector pAR1, said genomic Lbc3 gene
containing the coding sequence, the intervening
sequences, and the 5' and 3' flanking sequences. A
3.6 Kb BamHI fragment Lbc3HH, cf. Example 2, was
cloned into the pAR1 plasmid and transferred to
bird's-foot trefoil as stated previously.

The high level of Lbc3 protein, cf. Example 9,
found in transformed root nodules of bird's-foot
trefoil and corresponding to the level in soybean
root nodules proves an efficient transcription of
the Lbc3 promoter and an efficient processing and
translation of Lbc3 mRNA in bird's-foot trefoil.

The high level of the CAT activity present in
transformed root nodules is also a result of an efficient
translation of mRNA formed from the chimeric Lbc3
gene. The leader sequence on the Lbc3 gene is
decisive for the translation initiation and must
determine the final translation efficiency. This
efficiency is of importance for an efficient
synthesis of gene products in plants or plant cells.
An Lbc3 or another leghemoglobin leader sequence
can thus be used for increasing the final expression
level of a predetermined plant promoter. The
construction of a DNA fragment comprising a Lb leader
sequence as first sequence and an arbitrary promoter
as second sequence is a particular use of the
invention when the construction is transferred and
expressed in plants.

During nodule development around 30 different plant
encoded polypeptides (nodulins) are specifically
synthesized. Apart from the leghemoglobins,
nodulins include nodule-specific forms of uricase
(Bergmann et al (1983) EMBO J. 2, 2333-2339),
glutamine synthetase (Cullimore et al (1984) J. Mol.
Appl. Genetics 2, 589-599) and sucrose synthase
(Morell and Copeland (1985) Plant Physiol. 78,
149-154). The function of most nodulins is,
however, at present unknown.

Many nodulin genes have nevertheless been isolated
and characterised during the last five years. These
include nodulins from several different legumes.
Examples of such isolations and characterisations
are widespread in the literature, such as (Fuller et
al (1983) Proc. Natl. Acad. Sci. 80, 2594-2598),
(Sengupta-Gopalan et al (1986) Molec. Gen. Genet.
203, 410-420), (Bisseling et al (1985) in
Proceedings of the 6th Int. Symp. on Nitrogen Fixation,
Martinus Nijhoff Publishers pp 53-59), and
(Gebhardt et al (1986) EMBO J. 5, 1429-1435). All of
these genes contain nodule-specific regulatory
sequences. Such sequences and in fact entire 5'
flanking regions and 3' flanking regions can
furthermore be synthesized by automated oligonucleotide
synthesis knowing the DNA sequences for the Lbc3
and N23 genes given in this description. Entire
nodule-specific genes can also be isolated with
known recombinant techniques as described in the
above papers and by (Maniatis et al (1982)
Molecular Cloning, A Laboratory Manual, Cold Spring
Harbor Laboratory, New York).

The described method to obtain nodule-specific
expression of genes can thus be reconstructed and
performed according to the invention by anyone
skilled in the art of molecular genetics.

The method to obtain nodule-specific expression is
not dependent on the A. rhizogenes plant
transformation described. Any other plant transformation
system, e.g. A. tumefaciens systems, direct gene
transfer or microinjection, can equally be applied.

The A. rhizogenes system has been used and
characterised by a number of scientific groups and is
thus well-known from the literature. The
characteristics of the system are described in:

Willmitzer et al. (1982), Molec. Gen. Genet. 186,
16-22,

Chilton et al. (1982), Nature 295, 432-434,

Simpson et al. (1986), Plant Molec. Biol. 6,
403-415,

Tepfer D. (1983), Molecular Genetics of the
Bacteria-Plant Interaction, Springer Verlag,
Berlin Heidelberg, pp 248-258,

White and Nester (1980), J. Bact. 144, 710-720,

Jaynes and Strobel (1981), Int. Rev. of Cytol.
Sup. 13, 105-125,

White and Nester (1980), J. Bact. 141, 1134-1141,

Pomponi et al. (1983), Plasmid 10, 119-129, and

Slightom et al. (1986), J. Biol. Chem. 261,
108-121.

The latter two publications describe the restriction
map and nucleotide sequence of the A. rhizogenes
TL-DNA segment used in the transformation system
described here. With this information it is possible
for anybody skilled in molecular genetics to use
and reconstruct the "intermediate vectors" and the
A. rhizogenes strains described here.

Claims:

1. A method of expressing genes in plants, parts
of plants, and plant cell cultures by introducing
into a cell thereof a recombinant DNA segment
containing both the gene to be expressed and a 5'
flanking region comprising a promoter sequence,
and optionally a 3' flanking region, and culturing
of the transformed cells in a growth medium,
characterised by using as the recombinant
DNA segment a DNA fragment comprising an
inducible plant promoter (as defined) from root
nodule-specific genes.

2. A method as claimed in claim 1, characterised
by using a DNA fragment comprising
an inducible plant promoter (as defined)
and being identical with, derived from or comprising
5' flanking regions of root nodule-specific genes.

3. A method as claimed in claim 2, characterised
by using a DNA fragment comprising
an inducible plant promoter (as defined)
and being identical with, derived from or comprising
5' flanking regions of root nodule-specific genes,
said DNA fragment causing an expression of a gene
which is induced in root nodules at specific stages
of development and as a step of the symbiosis,
whereby nitrogen fixation occurs.

4. A method as claimed in claims 1-3 for the
expression of root nodule-specific genes,
characterised by using a DNA fragment
comprising an inducible plant promoter (as defined)
from root nodule-specific genes.

5. A method as claimed in claims 1-3 for the
expression of genes in leguminous plants, parts of
leguminous plants, and leguminous plant cell
cultures, characterised by using a DNA
fragment comprising an inducible plant promoter
(as defined) from root nodule-specific genes.

6. A method as claimed in claims 1-5, characterised
by the DNA fragment comprising the inducible plant
promoter and being identical with, derived from, or
comprising 5' flanking regions of leghemoglobin
genes.

7. A method as claimed in claim 6, characterised
by the DNA fragment comprising the inducible plant
promoter and being identical with, derived from or
comprising 5' flanking regions of soybean
leghemoglobin genes.

8. A method as claimed in claim 7, characterised
by the DNA fragment comprising the inducible plant
promoter and being identical with, derived from or
comprising 5' flanking regions of the Lba gene with
the sequence:

GAGATACATT ATAATAATCT CTCTAGTGTC TATTTATTAT TTTATCTGGT
GATATATACC TTCTCGTATA CTGTTATTTT TTCAATCTTG TAGATTTACT
TCTTTTATTT TTATAAAAAA GACTTTATTT TTTTAAAAAA AATAAAGTGA
ATTTTGAAAA CATGCTCTTT GACAATTTTC TGTTTCCTTT TTCATCATTG
GGTTAAATCT CATAGTGCCT CTATTCAATA ATTTGGGCTC AATTTAATTA
GTAGAGTCTA CATAAAATTT ACCTTAATAG TAGAGAATAG AGAGTCTTGG
AAAGTTGGTT TTTCTCGAGG AAGAAAGGAA ATGTTAAAAA CTGTGATATT
TTTTTTTTGG ATTAATAGTT ATGTTTATAT GAAAACTGAA AATAAATAAA
CTAACCATAT TAAATTTAGA ACAACACTTC AATTATTTTT TTAATTTGAT
TAATTAAAAA ATTATTTGAT TAAATTTTTT AAAAGATCGT TGTTTCTTCT
TCATCATGCT GATTGACACC CTCCACAAGC CAAGAGAAAC ACATAAGCTT
TGGTTTTCTC ACTCTCCAAG CCCTCTATAT AAACAAATAT TGGAGTGAAG
TTGTTGCATA ACTTGCATCG AACAATTAAT AGAAATAACA GAAAATTAAA
AAAGAAATAT G.




9. A method as claimed in claim 7, characterised
by the DNA fragment comprising the inducible plant
promoter and being identical with, derived from or
comprising 5' flanking regions of the Lbc1 gene
with the sequence:

TTCTCTTAAT ACAATGGAGT TTTTGTTGAA CATACATACA TTTAAAAAAA
AATCTCTAGT GTCTATTTAC CCGGTGAGAA GCCTTCTCGT GTTTTACACA
CTTTAATATT ATTATATCCT CAACCCCACA AAAAAGAATA CTGTTATATC
TTTCCAAACC TGTAGATTTA TTTATTTATT TATTTATTTT TACAAAGGAG
ACTTCAGAAA AGTAATTACA TAAAGATAGT GAACATCATT TTATTTATTA
TAATAAACTT TAAAATCAAA CTTTTTTATA TTTTTTGTTA CCCTTTTCAT
TATTGGGTGA AATCTCATAG TGAAGCCATT AAATAATTTG GGCTCAAGTT
TTATTAGTAA AGTCTGCATG AAATTTAACT TAACAATAGA GAGAGTTTTC
GAAAGGGAGC GAATGTTAAA AAGTGTGATA TTATATTTTA TTTCGATTAA
TAATTATGTT TACATGAAAA CATACAAAAA AATACTTTTA AATTCAGAAT
AATACTTAAA ATATTTATTT GCTTAATTGA TTAACTGAAA ATTATTTGAT
TAGGATTTTG AAAAGATCAT TGGCTCTTCG TCATGCCGAT TGACACCCTC
CACAAGCCAA GAGAAACTTA AGTTGTAAAC TTTCTCACTC CAAGCCTTCT
ATATAAACAT GTATTGGATG TGAAGTTATT GCATAACTTG CATTGAACAA
TAGAAAATAA CAAAAAAAAG TAAAAAAGTA GAAAAGAAAT ATG.

10. A method as claimed in claim 7, characterised
by the DNA fragment comprising the inducible plant
promoter and being identical with, derived from or
comprising 5' flanking regions of the Lbc2 gene
with the sequence:

TCGAGTTTTT ACTGAACATA CATTTATTAA AAAAAACTCT CTAGTGTCCA
TTTATTCGGC GAGAAGCCTT CTCGTGCTTT ACACACTTTA ATATTATTAT
ATCCCCACCC CCACCAAAAA AAAAAAAACT GTTATATCTT TCCAGTACAT
TTATTTCTTA TTTTTACAAA GGAAACTTCA CGAAAGTAAT TACAAAAAAG
ATAGTGAACA TCATTTTTTT AGTTAAGATG AATTTTAAAA TCACACTTTT
TTATATTTTT TTGTTACCCT TTTCATTATT GGGTGAAATC TCATAGTGAA
ACTATTAAAT AGTTTGGGCT CAAGTTTTAT TAGTAAAGTC TGCATGAAAT
TTAACTTAAT AATAGAGAGA GTTTTGGAAA GGTAACGAAT GTTAGAAAGT
GTGATATTAT TATAGTTTTA TTTAGATTAA TAATTATGTT TACATGAAAA
TTGACAATTT ATTTTTAAAA TTCAGAGTAA TACTTAAATT ACTTATTTAC
TTTAAGATTT TGAAAAGATC ATTTGGCTCT TCATCATGCC GATTGACACC
CTCCACAAGC CAAGAGAAAC TTAAGTTGTA ATTTTTCTAA CTCCAAGCCT
TCTATATAAA CACGTATTGG ATGTGAAGTT GTTGCATAAC TTGCATTGAA
CAATAGAAAT AACAACAAAG AAAATAAGTG AAAAAAGAAA TATG.
11. A method as claimed in claim 7, characterised
by the DNA fragment comprising the inducible plant
promoter and being identical with, derived from or
comprising 5' flanking regions of the Lbc3 gene
with the sequence:

TATGAAGATT AAAAAATACA CTCATATATA TGCCATAAGA ACCAACAAAA
GTACTATTTA AGAAAAGAAA AAAAAAACCT GCTACATAAT TTCCAATCTT
GTAGATTTAT TTCTTTTATT TTTATAAAGG AGAGTTAAAA AAATTACAAA
ATAAAAATAG TGAACATCGT CTAAGCATTT TTATATAAGA TGAATTTTAA
AAATATAATT TTTTTGTCTA AATCGTATGT ATCTTGTCTT AGAGCCATTT
TTGTTTAAAT TGGATAAGAT CACACTATAA AGTTCTTCCT CCGAGTTTGA
TATAAAAAAA ATTGTTTCCC TTTTGATTAT TGGATAAAAT CTCGTAGTGA
CATTATATTA AAAAAATTAG GGCTCAATTT TTATTAGTAT AGTTTGCATA
AATTTTAACT TAAAAATAGA GAAAATCTGG AAAAGGGACT GTTAAAAAGT
GTGATATTAG AAATTTGTCG GATATATTAA TATTTTATTT TATATGGAAA
CTAAAAAAAT ATATATTAAA ATTTTAAATT CAGAATAATA CTTAAATTAT
TTATTTACTG AAAATGAGTT GATTTAAGTT TTTGAAAAGA TGATTGTCTC
TTCACCATAC CAATTGATCA CCCTCCTCCA ACAAGCCAAG AGAGACATAA
GTTTTATTAG TTATTCTGAT CACTCTTCAA GCCTTCTATA TAAATAAGTA
TTGGATGTGA AGTTGTTGCA TAACTTGCAT TGAACAATTA ATAGAAATAA
CAGAAAAGTA GAAAAGAAAT ATG.

12. A method as claimed in claim 7, characterised
by the DNA fragment comprising the inducible plant
promoter and being identical with, derived from or
comprising 5' flanking regions of the
Lbc3-5'-3'-CAT gene with the sequence:

TATGAAGATT AAAAAATACA CTCATATATA TGCCATAAGA ACCAACAAAA
GTACTATTTA AGAAAAGAAA AAAAAAACCT GCTACATAAT TTCCAATCTT
GTAGATTTAT TTCTTTTATT TTTATAAAGG AGAGTTAAAA AAATTACAAA
ATAAAAATAG TGAACATCGT CTAAGCATTT TTATATAAGA TGAATTTTAA
AAATATAATT TTTTTGTCTA AATCGTATGT ATCTTGTCTT AGAGCCATTT
TTGTTTAAAT TGGATAAGAT CACACTATAA AGTTCTTCCT CCGAGTTTGA
TATAAAAAAA ATTGTTTCCC TTTTGATTAT TGGATAAAAT CTCGTAGTGA
CATTATATTA AAAAAATTAG GGCTCAATTT TTATTAGTAT AGTTTGCATA
AATTTTAACT TAAAAATAGA GAAAATCTGG AAAAGGGACT GTTAAAAAGT
GTGATATTAG AAATTTGTCG GATATATTAA TATTTTATTT TATATGGAAA
CTAAAAAAAT ATATATTAAA ATTTTAAATT CAGAATAATA CTTAAATTAT
TTATTTACTG AAAATGAGTT GATTTAAGTT TTTGAAAAGA TGATTGTCTC
TTCACCATAC CAATTGATCA CCCTCCTCCA ACAAGCCAAG AGAGACATAA
GTTTTATTAG TTATTCTGAT CACTCTTCAA GCCTTCTATA TAAATAAGTA
TTGGATGTGA AGTTGTTGCA TAACTTGCAT TGAACAATTA ATAGAAATAA
CAGAAAAGTA GAATTCTAAA ATG.
13. A method as claimed in claim 5, characterised
by the DNA fragment comprising the inducible plant
promoter and being identical with, derived from or
comprising 5' flanking regions of the N23 gene with
the sequence:



in 20 30 40 so 60 70 
GAACTCGAGCraXXZCGGGGftTCGATCCTCTAGASSS^^ 
SalX 



IXO 



120_ 



rATTTGAXCCAA^ 



10 AJUlTTTAAlGCTTTAGftTCATGATG^ 



220 230 240 250 260 270 280 

ATC&ATGCrATGATATTGATGGtCTTG&T^ 



290 30O 310 

AGIAGTTAGCaCACCAATAGAAGXAXTGAGXTArATTJ 



^AAAACTTTAGATOCTTTTCAAATGnTACATTS 



360 370 380 390 400 410 420 

15 caTATAGAMrTTTATTGACAATCCTTMAACAGra^ 

430 440 450 460 470 480 490 

ACTTAAATCATAICIAAAATCAACAArGOTACAAGAEAGATTG&ATGAGTTAGTTA^^ 

500 510 520 530 540 550 560 

AGTAAAGTGTTAGAArrGXTTGaTTATAAAACra 

570 580 590 600 610 620 630 

2Q TAATATAAAAASrCGATATTTTAtArAArATATTAaGTCTCT^^ 

640 650 6eO 670 680 690 70O 

AAATAAIMAATAAAGCAACTCanCAAWTTAATGAAACR 

n\Q 720 730 740 750 760 770 

AAAWWTTMSGCMGAXGGAIMSTTm 

780 790 800 810 820 830 ^ASSL 

25 TATCATTTATATGTTQTAAAarATGiaiTGCACTAGTAATTAGTTTW 

850 860 870 880 890 90O 910 

ATTTCTGTCTCTTGGCJUlCrCGTGAGAATTGIUlTATATTAarAAAGAra 

920 930 940 950 960 970 980 

AGAATAAATATTTAIATACAATTCCTAGATTTTGTXATAAAATTCftCW 

lOSO 



ATTAATG 
14. A method as claimed in any of the claims
1-13, characterised by the 3' flanking
region of the genes to be expressed being a 3'
flanking region of root nodule-specific genes of
any origin.

15. A method as claimed in claim 14, characterised
by the 3' flanking region being of leghemoglobin
genes.

16. A method as claimed in claim 14, characterised
by the 3' flanking region being of soybean
leghemoglobin genes.

17. A method as claimed in claim 16, characterised
by the 3' flanking region being of the Lba, Lbc1,
Lbc2 or Lbc3 gene with the following sequences,
respectively:



Lba



1S90 1620 
TAA TTA GTA TCT ATT GCA GTA AAG TGT AAT AAA TAA ATC TTC 

1650 

20 *TT CAC TAT AAA ACT TGT TAC TAT TAG ACA AGC GCC TGA TAC AAA ATC TTG GTT AAA ATA 

17X0 I'^^O 
ATG GAA TTA TAT AGT ATT GGA TAA AAA TCT TAA GGT TAA TAT TCT ATA TTT GCG TAG GTT 

1770 

TAT GCT TGT GAA TCA TTA TCG GTA TTT TTT TTC CTT TCT GAT AAT TAA TCG GTA AAT TA 

1830 I860 
25 ACA AAT AAG TTC AAA ATG ATT TAT ATG TTT CAA AAT TAT TTT AAC AGC AGG TAA AAT GTT 



ATT TGG TAC GAA AGC TAA TTC GTC GA 
Lbc1

1320

TaUk/n AGG ATC TAC TGC AXT GCC GTA 



X350 1380 
MG T6T AAT AAA TAA ATC TTG TTT CAA CTA AAA CTT GTT AST AAA CAA GTT CCC TAT ATA 

1410 1440 
AAT GTT GTT TAA AAT AAG TAA ATT TCA TTG TAT TGG ATA AAC ACT TTT AA6 TTA TAT ATT 

1470 1500 
TCC ATA TAT TTA CGT TTG T6A ATC ATA ATC GAT ACT TTA TAA AAA TAA ATT CCA AAT AAT 



TTA TAC GTT TTA AAA ATT ATT TT 



Lbc2



TAG/GAT CTA CTA TTG CCG TCA ACT 

1X40 

GTA ATA AAT AAA TTT TOT TTC ACT AAA ACT TGT TAT TAA ACA AGT CCC CGA TAT ATA AAT 

1170 1200 

GTT GGT TAA AAT AAG TAA ATT ATA CG6 TAT T6A TAA ACA ATC TTA AGT TTT ATA TAT AGT 

1230 1260 

TCC ATA TAC TAA AGT TTG TGA ATC ATA ATC GA 

1290 



and Lbc3



TAG/GAT CTA CAA TTG CCS TAA AGT GTA ATA AAT AAA 
990 1020 

TAT TAT TTC ACT AAA ACS TGT TAT TAA ACC AAG TTC TCG ATA TAA ATG TTG GTT AAA CTA 

lOSO 1080 

AGT AAA TTA TAT GGT ATT G6A TAA ACA ATC TTA AGC TT 

1110 



18. A method as claimed in claim 1 of preparing
a polypeptide by introducing into a cell of a plant,
a part of a plant or a plant cell culture a
recombinant plasmid, characterised by using
as the recombinant plasmid a plasmid comprising an
inducible plant promoter (as defined) from root
nodule-specific genes.
19. A DNA fragment comprising an inducible plant
promoter (as defined) to be used when carrying out
the method as claimed in claims 1-18, characterised
by being identical with, derived from or comprising
a 5' flanking region of root nodule-specific genes
of any origin.

20. A DNA fragment as claimed in claim 19,
characterised by being identical with,
derived from or comprising a 5' flanking region of
plant leghemoglobin genes.

21. A DNA fragment as claimed in claim 20, 
characterised by being identical with, 
derived from or comprising a 5' flanking region of 
soybean leghemoglobin genes. 

22. A DNA fragment as claimed in claim 21,
characterised by being identical with,
derived from or comprising a 5' flanking region of
the Lba gene with the sequence:

GAGATACATT ATAATAATCT CTCTAGTGTC TATTTATTAT TTTATCTGGT
GATATATACC TTCTCGTATA CTGTTATTTT TTCAATCTTG TAGATTTACT
TCTTTTATTT TTATAAAAAA GACTTTATTT TTTTAAAAAA AATAAAGTGA
ATTTTGAAAA CATGCTCTTT GACAATTTTC TGTTTCCTTT TTCATCATTG
GGTTAAATCT CATAGTGCCT CTATTCAATA ATTTGGGCTC AATTTAATTA
GTAGAGTCTA CATAAAATTT ACCTTAATAG TAGAGAATAG AGAGTCTTGG
AAAGTTGGTT TTTCTCGAGG AAGAAAGGAA ATGTTAAAAA CTGTGATATT
TTTTTTTTGG ATTAATAGTT ATGTTTATAT GAAAACTGAA AATAAATAAA
CTAACCATAT TAAATTTAGA ACAACACTTC AATTATTTTT TTAATTTGAT
TAATTAAAAA ATTATTTGAT TAAATTTTTT AAAAGATCGT TGTTTCTTCT
TCATCATGCT GATTGACACC CTCCACAAGC CAAGAGAAAC ACATAAGCTT
TGGTTTTCTC ACTCTCCAAG CCCTCTATAT AAACAAATAT TGGAGTGAAG
TTGTTGCATA ACTTGCATCG AACAATTAAT AGAAATAACA GAAAATTAAA
AAAGAAATAT G.
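
As an illustrative aside that forms no part of the claimed subject matter: the 5' flanking sequences in these claims are printed in blocks of ten nucleotides and end at the ATG start codon. The Python sketch below joins the last three printed lines of the Lba sequence of claim 22 and reports how far the motif TATATAAA lies upstream of that ATG. The function names and the choice of TATATAAA as the search motif are assumptions made for this example only; the claims themselves do not name a motif.

```python
def join_blocks(printed: str) -> str:
    """Remove the whitespace separating the printed ten-nucleotide blocks."""
    return "".join(printed.split()).upper()


def motif_to_atg_distance(seq: str, motif: str) -> int:
    """Distance in nucleotides from the first `motif` hit to the final ATG.

    Raises ValueError if the sequence does not end in ATG or lacks the motif.
    """
    if not seq.endswith("ATG"):
        raise ValueError("sequence does not end with ATG")
    start = seq.find(motif)
    if start < 0:
        raise ValueError("motif not found")
    return (len(seq) - 3) - start


# Last three printed lines of the Lba 5' flanking sequence (claim 22).
LBA_TAIL = """
TGGTTTTCTC ACTCTCCAAG CCCTCTATAT AAACAAATAT TGGAGTGAAG
TTGTTGCATA ACTTGCATCG AACAATTAAT AGAAATAACA GAAAATTAAA
AAAGAAATAT G
"""

lba_tail = join_blocks(LBA_TAIL)
# Hypothetical motif choice: TATATAAA is used here only as an example pattern.
tata_offset = motif_to_atg_distance(lba_tail, "TATATAAA")
print(len(lba_tail), tata_offset)
```

The same two helpers can be applied to the other printed sequences once their blocks are joined; only the input string changes.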
23. A DNA fragment as claimed in claim 21,
characterised by being identical with,
derived from or comprising a 5' flanking region of
the Lbc1 gene with the sequence:

TTCTCTTAAT ACAATGGAGT TTTTGTTGAA CATACATACA TTTAAAAAAA
AATCTCTAGT GTCTATTTAC CCGGTGAGAA GCCTTCTCGT GTTTTACACA
CTTTAATATT ATTATATCCT CAACCCCACA AAAAAGAATA CTGTTATATC
TTTCCAAACC TGTAGATTTA TTTATTTATT TATTTATTTT TACAAAGGAG
ACTTCAGAAA AGTAATTACA TAAAGATAGT GAACATCATT TTATTTATTA
TAATAAACTT TAAAATCAAA CTTTTTTATA TTTTTTGTTA CCCTTTTCAT
TATTGGGTGA AATCTCATAG TGAAGCCATT AAATAATTTG GGCTCAAGTT
TTATTAGTAA AGTCTGCATG AAATTTAACT TAACAATAGA GAGAGTTTTC
GAAAGGGAGC GAATGTTAAA AAGTGTGATA TTATATTTTA TTTCGATTAA
TAATTATGTT TACATGAAAA CATACAAAAA AATACTTTTA AATTCAGAAT
AATACTTAAA ATATTTATTT GCTTAATTGA TTAACTGAAA ATTATTTGAT
TAGGATTTTG AAAAGATCAT TGGCTCTTCG TCATGCCGAT TGACACCCTC
CACAAGCCAA GAGAAACTTA AGTTGTAAAC TTTCTCACTC CAAGCCTTCT
ATATAAACAT GTATTGGATG TGAAGTTATT GCATAACTTG CATTGAACAA
TAGAAAATAA CAAAAAAAAG TAAAAAAGTA GAAAAGAAAT ATG.



24. A DNA fragment as claimed in claim 21,
characterised by being identical with,
derived from or comprising a 5' flanking region of
the Lbc2 gene with the sequence:

TCGAGTTTTT ACTGAACATA CATTTATTAA AAAAAACTCT CTAGTGTCCA
TTTATTCGGC GAGAAGCCTT CTCGTGCTTT ACACACTTTA ATATTATTAT
ATCCCCACCC CCACCAAAAA AAAAAAAACT GTTATATCTT TCCAGTACAT
TTATTTCTTA TTTTTACAAA GGAAACTTCA CGAAAGTAAT TACAAAAAAG
ATAGTGAACA TCATTTTTTT AGTTAAGATG AATTTTAAAA TCACACTTTT
TTATATTTTT TTGTTACCCT TTTCATTATT GGGTGAAATC TCATAGTGAA
ACTATTAAAT AGTTTGGGCT CAAGTTTTAT TAGTAAAGTC TGCATGAAAT
TTAACTTAAT AATAGAGAGA GTTTTGGAAA GGTAACGAAT GTTAGAAAGT
GTGATATTAT TATAGTTTTA TTTAGATTAA TAATTATGTT TACATGAAAA
TTGACAATTT ATTTTTAAAA TTCAGAGTAA TACTTAAATT ACTTATTTAC
TTTAAGATTT TGAAAAGATC ATTTGGCTCT TCATCATGCC GATTGACACC
CTCCACAAGC CAAGAGAAAC TTAAGTTGTA ATTTTTCTAA CTCCAAGCCT
TCTATATAAA CACGTATTGG ATGTGAAGTT GTTGCATAAC TTGCATTGAA
CAATAGAAAT AACAACAAAG AAAATAAGTG AAAAAAGAAA TATG.



25. A DNA fragment as claimed in claim 21,
characterised by being identical with,
derived from or comprising a 5' flanking region of
the Lbc3 gene with the sequence:

TATGAAGATT AAAAAATACA CTCATATATA TGCCATAAGA ACCAACAAAA
GTACTATTTA AGAAAAGAAA AAAAAAACCT GCTACATAAT TTCCAATCTT
GTAGATTTAT TTCTTTTATT TTTATAAAGG AGAGTTAAAA AAATTACAAA
ATAAAAATAG TGAACATCGT CTAAGCATTT TTATATAAGA TGAATTTTAA
AAATATAATT TTTTTGTCTA AATCGTATGT ATCTTGTCTT AGAGCCATTT
TTGTTTAAAT TGGATAAGAT CACACTATAA AGTTCTTCCT CCGAGTTTGA
TATAAAAAAA ATTGTTTCCC TTTTGATTAT TGGATAAAAT CTCGTAGTGA
CATTATATTA AAAAAATTAG GGCTCAATTT TTATTAGTAT AGTTTGCATA
AATTTTAACT TAAAAATAGA GAAAATCTGG AAAAGGGACT GTTAAAAAGT
GTGATATTAG AAATTTGTCG GATATATTAA TATTTTATTT TATATGGAAA
CTAAAAAAAT ATATATTAAA ATTTTAAATT CAGAATAATA CTTAAATTAT
TTATTTACTG AAAATGAGTT GATTTAAGTT TTTGAAAAGA TGATTGTCTC
TTCACCATAC CAATTGATCA CCCTCCTCCA ACAAGCCAAG AGAGACATAA
GTTTTATTAG TTATTCTGAT CACTCTTCAA GCCTTCTATA TAAATAAGTA
TTGGATGTGA AGTTGTTGCA TAACTTGCAT TGAACAATTA ATAGAAATAA
CAGAAAAGTA GAAAAGAAAT ATG.



26. A DNA fragment as claimed in claim 21,
characterised by the DNA fragment
comprising the inducible plant promoter being
identical with, derived from or comprising 5' flanking
regions of the Lbc3-5'-3'-CAT gene with the sequence:

TATGAAGATT AAAAAATACA CTCATATATA TGCCATAAGA ACCAACAAAA
GTACTATTTA AGAAAAGAAA AAAAAAACCT GCTACATAAT TTCCAATCTT
GTAGATTTAT TTCTTTTATT TTTATAAAGG AGAGTTAAAA AAATTACAAA
ATAAAAATAG TGAACATCGT CTAAGCATTT TTATATAAGA TGAATTTTAA
AAATATAATT TTTTTGTCTA AATCGTATGT ATCTTGTCTT AGAGCCATTT
TTGTTTAAAT TGGATAAGAT CACACTATAA AGTTCTTCCT CCGAGTTTGA
TATAAAAAAA ATTGTTTCCC TTTTGATTAT TGGATAAAAT CTCGTAGTGA
CATTATATTA AAAAAATTAG GGCTCAATTT TTATTAGTAT AGTTTGCATA
AATTTTAACT TAAAAATAGA GAAAATCTGG AAAAGGGACT GTTAAAAAGT
GTGATATTAG AAATTTGTCG GATATATTAA TATTTTATTT TATATGGAAA
CTAAAAAAAT ATATATTAAA ATTTTAAATT CAGAATAATA CTTAAATTAT
TTATTTACTG AAAATGAGTT GATTTAAGTT TTTGAAAAGA TGATTGTCTC
TTCACCATAC CAATTGATCA CCCTCCTCCA ACAAGCCAAG AGAGACATAA
GTTTTATTAG TTATTCTGAT CACTCTTCAA GCCTTCTATA TAAATAAGTA
TTGGATGTGA AGTTGTTGCA TAACTTGCAT TGAACAATTA ATAGAAATAA
CAGAAAAGTA GAATTCTAAA ATG.
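
As an illustrative aside that forms no part of the claimed subject matter: the final printed line of the Lbc3 sequence (claim 25) and of the Lbc3-5'-3'-CAT sequence (claim 26) differ in a few positions just upstream of the ATG. The Python sketch below lists those positions and checks each tail for GAATTC, the recognition sequence of EcoRI. Whether that site was introduced deliberately is not stated in the claims; treating it as a point of interest is an assumption of this example, as are all names used in the code.

```python
def mismatches(a: str, b: str) -> list[int]:
    """Return the zero-based positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Final printed line of each sequence, with the block spacing removed.
LBC3_TAIL = "CAGAAAAGTA GAAAAGAAAT ATG".replace(" ", "")
LBC3_CAT_TAIL = "CAGAAAAGTA GAATTCTAAA ATG".replace(" ", "")

ECORI_SITE = "GAATTC"  # EcoRI recognition sequence

diff_positions = mismatches(LBC3_TAIL, LBC3_CAT_TAIL)
print("differing positions:", diff_positions)
print("EcoRI site in Lbc3 tail:", ECORI_SITE in LBC3_TAIL)
print("EcoRI site in Lbc3-5'-3'-CAT tail:", ECORI_SITE in LBC3_CAT_TAIL)
```

Only the last line is compared here because the preceding printed lines of the two claims read the same.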



27. A DNA fragment as claimed in claim 19, 
characterised by being identical with, 
derived from or comprising 5' flanking regions of
the N23 gene with the sequence: 

GJATTCGAGCXCGCCCGGeaiir^^ 

BA an LOO llO 120 130 140 

5 OTCtATTG&GACJtfaam^ 



AAATTXi 

220 230 240 250 260 270 280 
ATG&MSKaATCaTMTGAiama^^ 

360 370 380 390 40O 410 420 
CATATAGAATTTTATTGACW^CCTTATJUtfMTT^ 

430 440 450 460 470 480 490 

ACTTAAATCATMCTWkAATCAACAM 

500 510 520 530 540 550 560 
15 AGTAAAGTGTTJUSAATTOCTTGftirTMAAIUU^^ 

570 580 590 600 610 620 ^ 3 0 
TAATATAAAAKETGftEATTTTKC&TJ^ 

fi40 650 660 670 6BO 690 TOO 

710 720 730 740 750 760 770 
20 AAAAATTAATGCTTC»T6<aU«aOTOT 

760 790 800 810 820 830 840 
TJ^CASTTATATGTTGTWJlTMGMttGCJ^^ 

850 860 870 880 890 90 0 ,JSSS* 

ATTTCTGTCTCTTGGCAACTCGTGAGAATTGAAXATATTATAAAGATGAAAGGtCGTTACAATTTTTTTT 

Q20 930 940 950 960 970 980 
25 AKMTAJ^ATTTATATACAArrCCIAG&TTTTGXTATAAAATTC^ 

990 loco lOlO 1020 10 30 1040 1050 
GAiK^CACACCAAACTAGTCTCAAATTAAGTAAGGTGCTAATTATTAGCGGCTAGCTA^ 



ATTAAT6 



28. A plasmid which can be used when carrying
out the method as claimed in claims 1-18,
characterised by comprising a DNA
fragment as claimed in any of the claims 19-27.

29. A plasmid as claimed in claim 28, characterised
by being pAR29.

30. A plasmid as claimed in claim 28, characterised
by being pAR30.

31. A plasmid as claimed in claim 28, characterised
by being pAR11.

32. A plasmid as claimed in claim 28, characterised
by being N23-CAT.

33. A transformant Agrobacterium rhizogenes 15834
strain which can be used when carrying out the
method as claimed in any of the claims 1 to 18,
characterised by the bacterium strain
being transformed by a plasmid according to any of
the preceding claims 28 to 32.

34. A transformant Agrobacterium rhizogenes 15834
strain which can be used when carrying out the
method as claimed in any of the claims 1 to 18,
characterised by the bacterium strain
being transformed by pAR29 and being named AR1127.

35. A transformant Agrobacterium rhizogenes 15834
strain which can be used when carrying out the
method as claimed in any of the claims 1 to 18,
characterised by the bacterium strain
being transformed by pAR30 and being named AR1134.



36. A transformant Agrobacterium rhizogenes 15834
strain which can be used when carrying out the
method as claimed in any of the claims 1 to 18,
characterised by the bacterium strain
being transformed by pAR11 and being named AR1000.

37. A transformant Agrobacterium rhizogenes 15834
strain which can be used when carrying out the
method as claimed in any of the claims 1 to 18,
characterised by the bacterium strain
being transformed by N23-CAT and being named
AR204-N23-CAT.

38. Plants, parts of plants and plant cells,
particularly of the family Leguminosae, obtainable
by transformation with a recombinant DNA segment,
fragment or plasmid according to any one of the
claims 1 to 37.



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m m w 

:^mmit. sggfc^g^uy^nia^MoA 9fi:^^^^^>^k^'SStt^w^s^>/^• 
Igj^SiSti. {itJ'OXlJeiTX*^. (Santaren. J.P.et al.. Biochim. Biophys. 
Acta, 687:231. 1982)o 

CH3(CH2)« CH = CHCH2 CH=CH (CH2)t COOH 
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■tra-yu (PG) (DJ^-cib^. m'&s<Di&i§L»m(oi&mi><p G<Dmm^i^^ ^ ^ t 

(Murata,N. et al.. Plant Cell Physiol. . 23:1071, 1982; Roughan, P. G. , Plant 
Physiol., 77:740, 1985 ) . i^tzP GCD^'f-mmmmmmz^^-t ^ ^ -iz o 
3 _ ij h5 >X7 x5-"if (JgiTATase) ©SMil^f^t- J: o 

T^J6 t)tLTl^ C <h (Frentzen, M. et al. , Bur. J. Biochem. , 129:629, 1983; 
Murata, N. , Plant Cell Physiol. , 24:81. 1983: Frentzen, M. et al.. Plant Cell 
Physiol., 28:1195,1988) i)<^<7^^^ttX\.^fZo 

ATaseiSe^^ ^ ^< = f-^ A • ^^^t ^ZtlcJ:fOPG (Dtk^^Tm^M^T 

C T#1*FtBil : PCT/JP92/00024, 1992) o L;i)>L. ATase{i5t;0*l^43 ??^E 
L^ {c^3fe©ATase^*ti^4^-e:^SM^$i±yr<i: LTt). rt^Stt^ATase t 

l^o «»J;t{fs f^^LfeJ^SKSffe^^-^^O-^-^i^a^r 3^:^X:^©ATase<£r:S^>:^S 
{Cfg^ LTl.^ S n - >CD|I© P G (Om^^=Pm-^&iim2B%'Z:'^ K> ^y<n^^ 
t)J^8 %iJ>^j:i<^*<. v'O-r *>*^8 (PCT 

#fFai!l : PCT/JP92/00024, 1992) o 

$t>{c:. — 5^ H-e^'^t>nST >'>'l/ -AGP {ii:{C16:0-ACPil8:l-ACP 

(i 16 : 0-ACP^ 18 : 0-ACP©iiJ-^*^ 18 •A-kZ?XK>l^^^Z. th^^^ti^ (Tori yama, S. 
• et al.. Plant Cell Physiol. , 29:615, 1988)o C CD J: 9 ^i:*l^'^{i^3feOATase 

m^<^MmW^m^L'€\^^ ^m^(DmV.m.^tmi&.LT^^ ^ (Murata. N. et 
aL.in^Thc Biochemistry of Plants", Academic Press. 1987) o t.tz^ >MXi-i^ 



2 
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*^An^n^i:v^Anacystis nidulans (BIJ^ Synechococcus PCC 7942)«{£a^^ 
(Ono,T.et al.. Plant Physiol. . 67:176. 1981). 2 -D&.±Atl<^ti^ 
Synechocystis PCC6803 iti&Mm^X'^^ !^ ti)<^h>nx\.^tz (Wada.H.et al. , 
Plant Cell Physiol. , 30:971, 1989) o 

lz^-^LfzmmmK=.m^^^mA-t^o ttfeoT. ^>^«16:0/16:0-*5i:lDf 
18: 0/16:0- ©fiSfQ^^S*^ ^J^SbS® P G . SQDG. MGDG:feJ:a^D 
G D G 0mmmiCcis-m<D=.mm'^'S:mA-t ?> ^ ti<-S!t^-^-^ ^ (Murata, N. et 
al.,in "The Biochemistry of Plants", Academic Press, 1987) o i)^i}^^}^itB 

mmi^^m^tmmtLx. xxro^rywAcp (i8:o-acp) (DAQm^-^m^^ 

^A^^^^^WL> —0.16:0/16:0- d^J^Zf. :b-r*^(w#:S-#-'5 18:0/16:0-) 

Synechocystis PCC6803 © A 12fi!:^tSfn^k^Sjl^£^^Anacystis 
nidulansfw^A • HSg^-lfS C iJ-J: *5*^Anacystis nidulansfcti??:^ Lnc</> 
16:2A9,12*5<fcl>'18:2A9.12^^M$ii:5C<i:*<^t|-r*^x l^^t LXi^^iS. 
a.^Sttt?*SAnacystis niduUns^iS.mM^^tH^^^X$)^ C tt<^-^n 
Tl^'5 (Wada.H.et al. , Nature, 347:200, 1990) o 

ti^^ Cti^XiZ^ ym<D^ikmitmmco o ^ A e & (Reddy. A.S. et 
al., Plant Mo 1. B io 1 . . 27 : 293, 1993) *3 J: ?>* A 12<4 (Wada.H.et 
al..Nature.347:200,1990)^fi&fn^kS^^©jt<K^*-'lX#$nru^^o L*>L. A 
9<4{wZimi^-^*^SA^tiTi^/sttnt^. A 6fi:fc«tOfAi2{i^fi&fo>fl:^^{i. 

^n^-nA etti Al2fi[*^fia^a^kt■i.c<l:^i•^§^i:v^o A9{fi:i:Ai2& 

i!Mft^^©A 9fi[^^l^f□^kt-S^fg©®fe^^i^i^>t^Tl^^^*^oy::o 

tJfeoT. *^B^«s llgffi^KOA 9fe^^tafQ'^k■rSl?^©itfe^*5<fcC;-?:©- 
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±iaafi*J^:aj5fe'^'S^&. Anacystis 5 >^O^V A 

- D N A ^nfz^. ^ :^ - D N A-^ffi^lfSHSa^?^®^!^ 
SCli{c^?&L. 2^^BJ^^^$-lir^{-So^o f^i^^-fe. *^HJ«. OT®*: 

(2) fl§sci^-^L^Jigfl&®f©A gfe^^fis^D-fb-rsstt^wrs^ 

(5) (l)7!7M(3)®l.^^*l*-{-iB«®me^X(i^^it<£^©-S|5^#t;'f^ U ? iJ^ U 

(6) (5){cte«K<^>1t^m^^^b$■^±■r1t^fe^4^<£:^S^^-^i:^c«i:^#mi■rs*t^ 

(7) {i)JbM(3)©V>-rn;6^{ClS«®«fe^X«^^iSfe^^O-S|5^#t;'}5 U 5f ^ U 
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^\mt. des 9 var ^Wii^^- V't^TX ^m^nt-^^ 7.(D7.^r u ^ J\y 

-Co K^m\\tmM (MS CD 2) (DT ijnwm(^^m^yr^-to m^-^nm 

^2EI«s des 9 var m¥(^-:^ ^^'^^ Anacystis nidulans A 

^4Ei{i. des 9 ^\iit^tf7.(D7.^y^^)^y-^oKX^m^\\Lmn (msc 

(Kaestner.K.H.et al. . J. B i o 1. Chem. , 264 : 14755. 1989) . ^ h ( 
Mihara.K.. J. Biochem., 108:1022. 1990) SlO^S* ( S tukey. J. E. e t 
al., J. Biol. Chein.. 265:20144. 1990 ) CDT^xT d ^ vU- C o A^t&^n^k®^^©^t 

Bgg{C*g-^Lytl!&0K® A 6 &*5J:a'A12fi:cD:FI&fD^b^#S.?>'«^*i^®flBK 
{cj^^ Ly::l!§ft&K® 3 tt®^t&?Q^biS^ (Yadav, N. S. et al., Plant 
Phyliol.. 103:467. 1993) ©^t^fllit t < L-Cl^^.£ l.^o ^^H^itiS^^ 
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^n-S 5 >^{*1tf-IS^^ tl-f . •eaj;t{fAnacystisM. SynechocystisM^ 

i^^litlOfiSfO^^S^^fiSfn'fb-f SfcJi>{c«AnacystisM (Murata,N. et al., 
Plant Cell Physiol.. 33: 933, 1992 » C(DXmX'm ^J^-^ 1 m<D3 
^) CD A 9 fe:fia5fO'fbffi^#®:35'*^AnabaenaMB:C)fSynechocystisM©S^fil«fc t> 

t-)^j:*:>'^s Synechocystis PCC6803 iAnabaena variabil ist'fi^ -^r^D^flgK 

®! (C16) ^^-^LTl/^S (Sato.N.et al. , Biochim. Biophys. Acta, 710: 
279,1982; Wada, H. et al., Plant Cell Physiol. , 30:971. 1989 ) (DizS^LX. 
Anacystis nidulans-ett (J t A/ *<sn-l <J: sn-2<J; C16^*g^ LXl^ 5 ( 
Bishop, D. G. et al., Plant Ceil Physiol. , 27: 1593. 1986) o 'i/t-oX^ Anabaena 
i Synechocystis® A 9 ^:^muitmmt^lZl8:0/16:0-<D^^m^^nt LT 
sn-l©18:0^18:lA 9 {Ci^^lgfO-fb^SSI^^W-r S tMiotl^o Ctli^MLX 
Anacystis® A 9 &^ia5fD^b®i^ti^tCl6:0/16:0-®^^®<£r»S i LTsn-1® 
16:0^16:1A 9 {c:ft&3fn^ht-S?g14^W-^5 tSt>nSo $^{C. ii5^*t<^{C# 
< m ^ n S ISfD!»-i=^a*n6 : 0/ie : 0-T* S C <h t> . i^^*!*^® iSfO^^SI^^ 
ISfn-fb^ ;E) J6 {c (SAnacys t i sM® A 9 &:^^^^tM^<D r> *<Anabaena*5 «fc 
SynechocystisM®^^<k ^jS-^'^fe'So 

y^T i ySgfie^J^Wt-S A 9fi::?^ta?Q'fbS^^^=i- K-T'Sfc®^^'^. *S«=i K 
>{Cfe^V^T®^M^j:-5"Cl.^Tll— ®'i? U '^y^ K^=i- K^SCi®-r^5^iS 
Mtt#:^^t;*>®'r*5o *f&HJitfe^{i> i:{cDNAiA<i: LX®af*:fi1l?Bfii<£: 

-1-4 {cie«g$n^T i y^Be^'Jfcjn^T. a Q&^ma^bffi^^stt^w^ad^^' 

T ^ cfc l^T i y KBE^J^^ti- ©"^^ o 
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m^<DMn<2:^<^^^i^^^'^^ ^DNA^mm l> y a d n a l- 
xMnm\'^ ihti?> i><D^mif ^ tii)<T-^ mw^i^it. a dash ii 

(Stratagene)^0:7 r-i^ ; pWB15 (Stratagene)^© n x ^ K ; pBluescript 
II(Stratagene)^©:7r-i^^ Vm^^f ^ ti}<X'^ ±ie-^i>i5^ ^(D^ 

^JCD— (^Jx.{f. :^ 1 |g!CDMSCD2CDT ^ y ^i^?lJ#-^260*^C>295©— SP^^ 

■ d CD ck -9 (C LTSfe L y-c a - > {-*5 ^:^ 3^^B^it<K^©^Sie^iJcDgfe^2S^C/ 

- hj* (Maxam-Gilbert, Methods Bnzymol. , 65:499, 1980)-^M13 V 7 — 

^ Ki^S^i^SCMessing.J. et al. . Gene. 19:269. 1982) ^ 

(J.Bacteriol., 175:8056, 1993 ) lZ'^^TrfoCtt<X^^o 
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genetic transformation and gene expression: a laboratory manual". Draper. 
J. et al.eds..Blackwell Scientific Publ ications, 1988J mm<D:f5^^^^^X 
'rro^tt^x^^o ^(Dmt\.Xkt. ^ J^^^m^'^^:^^-^ 
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j*(Nature, 287 (1980), p. 654: Cell. 32 (1983) p. 1033; EMBO J.. 3 (1984) 
P. 1525) s T-DNA±©ffi^mitfe?«Sl^^^«$'<i-^^i^^-^*'iffl^^ 
^^^^ ^-mEmO J., 2 (1983) P. 2143: Bio/Technology. 3 (1985) p. 629). 

^_jS(Bio/Technology, 1 (1983) p.262; Nature. 303 (1983) 
P.179: Nucl. Acids Res. . 12 (1984) p. BTlD^Xi'*^* 0 . cnCOVm®:^ 

n 5 mm--^^^ K^WL<om%.^^m.mw^^mt ^ c i *<t' ^ s i ^ .-^ 

(^Jx.{^-ilFB^) iSST («?>J>Liir4'C) T'^^L. ffi'^-^O^W. 

CIIM^jn Anabaena variabilis® A 12{4^fiafn'(bS5«^e^F (desA) <0±M 

Anabaena variabilis I AM M- 3 (*«;^^^^M^^^W^m«}: ^ 
^) j^lOOmlOB G-lli^itfe ("Plant Molecular Biology". Shaw, C. H. ed. , p. 
279. IRL PRESS, 1988) T'^«L/;:o 25°Cs 1, OOOluxO^TfeiTT-^lf 53-120151^ t 
•9 Ls ^t^^-m^^W^-lir^o :^«f«^^a-e5.000_g-eiO^FB^iSiilN53'lli^SC<i: 
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^ J A^mm^ ^tzif>. mW'S:50ml(D Am (50mM Tris-HCl. ImM 

BDTA.PH8.0) izmmtxmf^L. Mst^^mt^^tKj^^mi^^^^^tLxm 

jRL/^io ^i^^ 15ml® (50mM Tris-HCl, 20mM EDTA. SOmM NaCl.O. 25 M 
sucrose. pH8. 0) tzmML. BmxmmLfzAOmgO -l^ (Sigma) ^Ha^ 

3TCX 1 mmmt -5 Lfzo e^jcy nf^Th— bf K^ismgi s D s ^mmmx' 1 %C 

XolZlM^. $ ^>{C20mlOi? nn;^;l//./'1' VT ^ >'UT>'l'=i-;U (24:1) ^*n;i 

5 yl/T/l/3--'l' (24: l){w J; «3 1?** ^ L/^.^. 7Kii{-50ml® J^:^ ^ -^U^ADX.. 

ADNAPSl«!l^/f^:^#{-##f*«:^TlHlJRLy::o CCDDN APiSi^^20ffil 
©A?g{-^:6>L. NaCl^i^iliE"eO. IMJCL. $ t> {CRNas 6^*^26 igp-e 5 Omg/mUC 

©7 -;i/-t?2 IDtttU L^m> TRScj^oyv ADN A^Ji^J' y -;u^*nx.S c 
iJCj: ^J^tl^^^t LT@iRLx 70%:i^:J^ y 1 ml® A?i£JC^*> L 
Anabaena variabilis ©y y ADN A^^i U^o 

S2^e,«Anabaena variabilis ft^j^O/gllgKJCi^g^ Ly=l!gS^ife© A12{a:^fia^P 

m^M. No.3aF04) LA:I^. Al2&.:^mmitmmiABi=F(0±mizm^LX^-':f 
y ,j -5?^ u-A (ORF) *<?¥ffiL. cn*^':7^t&fa'fl:^^<i:^t>*^oill 

ie^JtcaSLT. 4 (ie^iJ#-^ 5 -Bfi^J^-^ 8 ) <£:^fiScL> 

Anabaena variabilis©^ y ADNA^^Mi: LTP C R^lf ^.i o 

o^^io S^SJi^ 100^/ l©SJtv?S4' ^ I" ^ - M. Anabaena 
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variabilisCD^y ADNA^ 1 jt/gAtl. GeneAmp PGR Kit (^Jlit) ^M^^ 

rnf^-otzo ^m^o^mumii. 95°c (155-) . 45T (155-) . 72'c (2^) ^ 

190bp) CDDN A»fn-*^'i^^<i^-^'> H'<i^ CODNA C&.r. 

des 9 var Kf^T-coM^I^SS^Klenow^ 5 ^> > h 'fb L^^. ^t'^:^ 

$ KpTZlSR (Pharmacia)©Sina I a5ti:{C^ V^L. ^5tD N A -tr 

_ (Applied Biosystems) ^ffll^T4gSIE?(I^^^ L^o #t>n^:^S@2^J^Se 

1 iZTjk-to z(Dmmmni}'^h>m^^n?>T ^ jmmy^i (i2?ij#-f-2 ) 

^«^;;?.<D7.7^TD>r >'l/-CoA^fi&fD>fb^^<J:W^^i:tg|nl'tt^^L/c 1 E : des 
9 var ^}iri)<=i-V-t^riJmWint^^:^<DX7'Tu-(jV-CoATmmit 
MM (MS CD 2) ©T i y ®?l£2^J®Jtt^^7i^^] o 

des 9 var lSi)n^':f^ — 'ftLX^ Anacystis nidulansO^ ^ AD N A 

mo.lfig ©Anacystis nidulansCDy ^ A D N A ^"KlSr L . 0.8%T75fn-x 
^ jVn,^'^mXDN Am)^^ii^M'^. '^-(a>;t>zru> (Hybond-N*; 
Amersham) IzzTa y 7^ ^ > ■^^LtZo ^n-^DN A«Multiprime DNA labelling 
KitCAmersham) ^M^^X ia-''P} dCTPxmW^LtZo 6xSSPB[lx 
SSPElilOmMU >mmmm CpHT.O) . ImM EDTA, 0. 15M NaCl], 0. 2% S D S *5 «fc 
Xfl00fis/ml^'>>m=f-'DNAi)^i^^^m^'C55V. imfS^ V^a^--> a > 
l^-^^n-:/DNA<i:P<>:^^>^K*6^"«±-^o -^rO^. ^ >zr U 2 XSSC 
C 1 xSSCtiO. ISM NaCl. 15mM^ h U ^7 A] fpX^M.. 15^^2 13. '{k 

V^-Z:•0. lxSSC4J-e40"'C. 15:^3-^ 2 IhIM t o LT^i.^. ;t - h 5 ^ ^ "7 ^ - 

*-*<:^ltl^n^ (^2^:S4'. N o ntiyy ADN A^$iJPSl^^-el2JirLXl-^ 

CIIJfe*^J2) des 9 var Kf^tt^BlSHtOi^^^Anacystis nidulans^ ^ A^i^D N 
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Anacystis nidulans R2-SPC (m^:^^^^fflllS^^W35BfTct «9 (D^ 
-^D N A©19S8ti^ Anabaena variabilis <DJ#-^i:|iI^{C?T^<£ o /Co 
j^lOO/zg©'b'y ADN A^Sau3AI-Z?ia5^?^^hL/i^x Molecular Cloning 2nd 
edition,pp. 2.85-2. 87(Sambrook. J. et al. eds.,Cold Spring Harbor Laboratory, 
1989) CD^j*{Cite-oT. i/ H«i^^^i2TT©j@Sii>!5^1ll{C J: «9*«7 9 *^ t> 
23kbp©DNA»f>T'^lHliRL^Co cn^BamHI iHindl I IT-ej^ L ^ ^ A ^ :7 r 

— v?'^^ A DASH ll(Stratagene) Ki7 u——:^ VLfz^^^ y r-i^&iFl^^^ 
y y^i^i/^LAnacystis nidulanscoy / AD N A 5 -< U — ^^i:fco Cl©:7 

j^^^y^ i; -;g.:^l»MP2392 JCSSfe$-ar> N Z YMi^Jfi^Atl^ltS^^ 
IScmCD -1- - UKt\.^XmMm075m(DZf3 - ^J^fife^ ■frfc^> t^-T P >^ >' 
:/L/> (Hybond-N-": Amersham) {C^u -y -r >f L^o ±IHcDiMf V^tffi 
mW^. Ca-'*P] d CTP-e^liL/cdes 9 var Kfit^C©^ VT'U^iS 

^#/co c:®4'*^t.tt3e{C12^'a->^Saf. '^^JC$toT7 r-i^DNA<£-# 
lie»tl^7T-i^DNA^i!t«S©$iJIS^*t?iiO»fL. 0.8%T:</n-xy 
J\ymn.-^mX-ii^m^. -^^ u:y/ >:/U>tz:/a y^r^OirLtZo <l(D;i>ZfU 
>^±ieox^ U --ViJ^'iUl^^l^^T-lMf V^^-^f L. y^-yDNA<i:/^'r'/ 

i A15©2 ^ u->3&<#t?S^»^>'^:^-'^^^L^ ^/::'r hD NAif^t©^ 

$ fc^n^'nil*J<ta^l5kbp"e&o/iy;-feg6*I©ORF:i:{4^^^trO{c:+55-iJiiiJ 

»fL> CI© 2 ^cj->®>i' h DN Aiz^^wiz^-oTb^ommnm-^^^m L 

- > i fcJfe^ 5 kbp® D N A»fK-*^'^til$ tL^©-e. C tL^pBluescri pt SK- 
(Stratagene) <DXho Iif■1'h{C1^-y:^i'^-^>:rL^ X 5 t X 15S3fe®DN A 

»f>T-^-€-ti-en#t;7'^;^ i Kp 5 xipisx^^fco p 5 x<b pi5xo#*Hni 

Wf$n/^ C^3EI: A 5. A15*5J:0*pl5X©'f hDNA»f>T-Offl5g9#> 
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;{,ON>fyUir>fXLfcDNA»f;t^^^« 1.^^91 tides 9 nid <D^m 
t-byX0©;^lSl^^^o ffll^3^«3«des 9 nid :S:^t;^«©'>-^ 
;&^fo 5.1. 25*5^:0^0. 5kbp©#/<-«^®^SJ^*5«:»-^-^^^^-^-^'^'^'' 
$mii^OIB&-^{i^ B, BamHI :H. Hindi 1 1 ;N, NotI :Hp. Hpal ;RI. EcoRI :RV, EcoRV; 
S,SalI;P.PstI;X. Xhol^^^l o 

t)f^^L. des 9 var BtK-*<-^ ^ 'J ^i' ^^-^ ^^^^^^^^ 2 kbp©D N 
^®DNAWf>^-4»{C«834bpA>e>^j:50RF (des 9 nid) iJ<^^L (Be^J#^ 

3) . 2iBmm(Dr=^jmt<:^-\'^tix\,^?>tm^^nt:: imnm^Ay o ^ 

{c^n-->^X/cAnabaena variabilis 45fe©des 9 var ^f)!*^^- K LTV^ 

afTiyffiJSE^J®ft?«TV7 h (GENETYX : y7 h'i7iTMM) 
y^ge^j(Z>7^-^r.<-7. (EMBL*3J:C;dDB J) 

^,^T5yl?ge^J^D«^5fe*^T^i:o^<^:C^. v^>^©x-rTn>f/i.-CoA:f^l&fa 

'(k^#<i:0;|il5lte7i<^'f*:-e(i*^3 0 ^-^fe 5*<^mfi«J{w*SJwi^l^ C «h CIS 4 
^ : des 9 nidt^'^X©XxTD^;U-CoA:?;|&^^bl^^ (MSCD2) © 
Tl^Wtmncoitm m?tL/cdes 9 nidiiliiflS^^:^^tafn^b-f 
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Anacystis nMixlinsl.i:^mm^mmtL'Cmni^m^Lfz^mmm(D 9 ^'S: 
T^mmt't^^^i^^^^^^^^^^^*"^'^^^^' (Bishop. D.G.et al., Plant 
Cell Physiol.. 27:1593. 1986) fcJ6. des 9 nid*^ = - K^T S U 

^Ltzo fiP-^. iJ'-i L-CpET3a(Novagen)ffllK -^©Ndel iBamHIOPyHw 
des9nid^Ti y*iS{C^tt^i:T^ y^^ftltnCi'^fflt^LT^'a-^y^-t-^ 
C i:^J£lT©«k-?J-«-'^fi'^«i:-3^o des9nid©=J- FfS^ ^ © C ^^{M 
ii:^{CBainHI1^-f h ^Ans/::i6{::> C5lS^^i*tr 2 ttBFr^mSSe^J^^oX P 

•tr>X>^^l'-7-:5' -ACGTCATGGCCTGCAGT (Ti6^ «Ps 1 1 i^- ^ h ) (i5?iJ#-t 9 ) 
T>f--tr>'7.:7'^^-^-;5' -CGCGGATCCTTAGTTGTTTGGAGACG ( 1 fi^«BamHI1^ 

4 0 b p©^^*^li'5»tLx c:n*pUC190Smaia5{i{C-y-:7'^ n-— ^'i/ LX^S 
gE^Jfcr^^l^O^XV^Ci^iS^^Lifeo Z(D^mm^i?>nfzzr3 K©BamHI©T 
mi^mconmm^^l^^tio cn^rEcoRIiPstl-^llJC-tJOKf Ls -:Sf. pISX^R 
i;$iJPS^S'C'-eJ»f"^S^<t{wJ:t)> :^hyy=i K:^©il:^fcBamHmfit^^AL 
^co CCD^vT^^ K^Sall-eWirL^^. 4 acDdNTP#:&T-t?DNA>f. U ^ 5 — fe* 
Klenow^f^T-^ffl^/^TFilll inS:JC^?T^j:^^v ?l Hindi II-etU»f L^o Ctl 

{C. JeiT©2«©-^fiScDNA*>b»fiScST:5^3?'r5'-^«A-rS»{CcfcOT ^ 7 mif§#J 

{cNdell^fit^^ALyrio m*>^ 

5' -CATATGACCCTTGCTATCCGACCCA (T^tiNdel) (Ba^J#^ 1 1) RO^ 

5' -AGCTTGGGTCGGATAGCAAGGGTCATATG ( 1 mmUMel^^ 2 «^«HindII lO 

—§15) (iE^iJ#-^l 2) 

pDes9Nde) "iT^ (Molecular cloning pp. 250-251; 1982) (C?^feoTp^ L/c 
:^lS®i*BL21(DE3) (Novagen)<D=i >b*-r ^ h-fevKc^AL^ T^fv/'J^Btt 
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JCcfcSSSUC J: «9 3^®fe^^B L D E S 1 ^»/::o 

B LDE S 1 S.C;^pET3a®*^Wr -6 8121^ (BLl) ^100nil©M9^Jtfe ( 
200 Aig/inl®T>'bri^'; >. 4 mg/ml lO^^M FeC13. 0.5;fg/mlh' 

9 X y^\, 1 mg/ml;^-r 5 Jm^'ktS^ {C^«L. 3 7^-C=^«L:feo 

ifefi600niD-e0.50.D.lcn£5*t?^#^^Jt^^. ^ V ^ n tfyl/^^;*-;^ 
hi/ K(IPTC)^**^^lgliDM{C^j:-SJ:^{c:Jn;i/Co M{-1^Ph^^4IL. A 9 
fe^fi&sftl-fbas^ite^O^^^^^^^- lEliRL/c;^MB-<l^^y h^l.2!« NaCl 
-e^o^^. flg^^ffltii Lfco H&KflBlighiDyertD:^^ (Can J. Biochem. 
Physiol.. 37: 911. 1959) {CS&oTttaiL. 2. 5 ml© 5 ^ rJ' / ->'^-e^ 

t^?&tt'fi:LT85'c 2 ^wi^wj^-^^mmWi^^f- ^MtLtzo ^i^tzmmm^ 

y^C-R7A plus (S^Mf^Hlr) ^fflV^^o ^t-^^o 





               16:0   16:1   18:1(11)
BL1 (0 h)       47     20      29        4
BL1 (1 h)       50     17      29        4
BLDES1 (0 h)    44     22      30        4
BLDES1 (1 h)    40     28      28        4



C C -eNpFal « IPTGtC J; S rS' ^ ©^^NprB^^^t-o 

BLDES l-e«l 6 : 1 *<liinLTW^-SC<i:7&^Hgt>*^-C*So fip-^.^ ^Sfe 

IKLAztCi;?). BL l:^{Cit'<BLDES 1 "eii 1 6 : KD^fi^-t. 18:1 
(9) t>^fiJtL> des 9 nid - Kf S jf^ U ^T'^ K(± 1 6 : 0 t^;5^ 0 T?n£ < 
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Anacystis nidulansS*®des 9 nidit^K^^iJ^® J: 9 {w LX ^ ^<=i i^m^i^^ 

(1) f^'®*^^ 
pDesQNde^SacI <hSalITill!J»ft-5*J- cfc e)MS^^OiO»fS|5&-e4fet tL^des 9 

PSNIP9 (Schreicherb. EMBO J. 4,25(1985)) *^ t)^^^*-^® transi t@2^ij^ 
HindllltSphlTfiODaiL. -etliinl-cD^ijm^^T-ejSf L/cpUC118{c d-^ 
:/^-r5C:<h{Cj:f9> transitBE^J©To{t{C"r;l/^:^ D-->^^'b--r h^W^^ 
K (pTRA3) ^#^o COHindlll-b-'f h *iai»f^Klenow®£5g-eFi 1 1 
inLXbal U (pTRA3X) o CO:^^:^^ KpTRASX^Sal liSac 

Itr-ejgf $ oT?iytdes 9 nidate?^ 

»fit^i*AL/c (pTRA3Xdes9) o FT «RuBisCO© trans i tB2?iJ(c 

^ntlRl— ®Sg^1%-edes9 niditfe^?5^^IR^nSo Ctl^Sac I 

S KpBI121(Clonetech)^$Jj|Sffit#!lSacI<i:XbaI-e«»f L-C»^^-5X 
5 KpBK-GUS)lii8-Glucuronidaseite^=^ (GUSSfe^) ^'^AjX^thH'^ Ctl 
\Zt} ^) V 3 —^e-^ ^ i7 ^ ^ 7l/XCD35Sy P * - :^ - «i: y ^-^ U V-^Sfcl^^ ( 

Nos) ^-x^--$'-<DmKm^Lfz^Km.^=i-^nK^/hz.t\z^K>. 

(Dm^Am^i^ ^ - (pBI121(-CUS)Rbsc-des9) ^W^fZo 

( 2 ) pBI121(-GUS)Rbsc-des9©T^'p/<^ X U ^ J^^<D^A 
Agrobacterium tumefaciens LBA4404 (Clonetech)^50nil©YBB:^l|fe (1 1^^ 
^ h* - 7 4^ X 5 m-^^^ ^ I 6- ^-f h > 1 3 *S 5 g^ 2mM 

MgS04(pH7.4)) {C^«L. 28'C-^24I^P^i^«^. :©il?g[^3, OOOrpm. 4^. 20^ 
©it^ll^•t?^»L/Co lOml® 1 mM Hepes-K0H(pH7. 4)-?? 3 IsIiSfe o 3 

ml<D105«i!^U -feP-^l'T- 1 iHl^Jfe^^. fti^fi^J- 3 mlO 10S5^^ U -t n - yUfCj®® L 

C® i: -9 (C LT?i^a^S[50jtt lRa'HulB®y5 X i FpBI121(-GUS)Rbsc-des9 
1 jtzg^^jx^y htcAtlN Jc-Ui? haTHU—v/a (Gene Pulser: 
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BioRad) ^m^^t: 2500V. 200Q ©^{«^■T?«M^•?>'^X^*^^:)•. ^5 

j;^^ KDN A^T^n/<^-r U 'i? Afc^ALfCo C®M^$^ J- y > F>'U:7 f- 3. 
-y{Cig?Lx 800 10 SOC^ilfa (1 l^fz^D h V h > 20 ^5-©^^;^ 5 
NaCl 0. 5 g. 2. 5 mM KCU 10 mM MgS04, 10 mM MgC12, 20 mM 
PH7.0) ^*nx.. 28t:-C? 1.5^m$^m^^LfZo Z(Di^mm SO/^l^. lOO 
ppmCDiJi--^^ -y^'^^ti^ YEB«5^^1I6 1.2X) ±tct^. 28V^ 2 B 

^des9 nidite^WfJt^^^'o-':^*!: L^iMf >^5-*ftcJ; ^^X^ K 
pBI121(-GUS)Rbsc-des9<£r^/^T?<'^ S CI i ^SIEl L /::o C ® Agrobacterium 
t ume f ac i ens ^ ALBBSDES <i: Pf -S^'o 
(3) ^^<=JO?^SIte^ 

_hiS®SmLBBSDES^s SOppmO^ •> >^^t;LB^{*:^*t?28'C. 2^ 

F^^<i:9^«L/io :^«?K1.5 inl>£: 10, OOOrpm. 3 ^S-FaligiC.^ LT^®^. 

v'>j£-I^< y::ii>{cl ml<DLB^m-ei5fe^L/Co MiClO. 000rpm> Z^THi^'L^LX 

— K^©MS-B5^ilfe (^>>*.'UT-r— >1. 0 ppm. 0.1 ppm. S 

^♦^^ 0.8 X^^tr) ( Murashige, T. and Skoog, F. Plant Physiol.. 15: 
473, (1962)) JbC "7 h ^ > No. Ij^lS (0 7.0 cm) ^g^. 
^^±{cLTS^MV^/^o v'-^'-U'^^^•5 7^yl/A-ev'->'^Ls 16^Fb^HJ. 8 
I^PalB^O^^t' 25*^. 2Bfii^^LfZo oV>T ^ 7 * 5 > 250 ppin<£:#t; 
MS-B5i#i|fe±{C^ InHtC IOHFbI^H LTT ^n/<^ 7^ U '^7 A^I^* Ly;io 
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Et-^^^*^^ 250 ppm^tftii-^^ 100 ppm^^O MS-B5^Jft±J-S^ 

100 ppm^^t: MS-HP^Jtfe JVT y':=.>:ELZ/i-y ^ U >mi^^ 

^tt£^^ MS-B5^i&) tzmmLfzo 10BF5^«^. mULfziy h '^iJ i-^ 
^ iy >m^<Dmn^^i^t L. >' hj^-y ^X^O^'^^ :^ 250 ppm«r# 

CIIJfe««J 5 3 mnUm /<3 O >fV A -9- if yS-OV - -tf- > :^«T 

^« (Rogers, S. 0. & Bendich. A. J.: Plant Molecular Biology Manual A6; 

1(1988)) {c^ifeoT^Tnj:-^/::o in-&> 2 g<Di^y<=i<Dm^i^^Bmi^-C'n^l^^ 
C T A Btttl3i^«?S"ey y ADN A^iifco 1 0 g ® D N A ^$iJRS^^BcoRI 
«i:XbaITiaj»f^0.7XT7y'a-xyVU-em^«c«iL. ^<D'^±^ ^ >m (Hybond 
N+; Amersham) {CO. 4 N NaOHtf^^L/Co C ©a{CpTRA3Xdes9*^ 9> trans i t# 
t©^^5Rl'fk»3gate^^3?'o-y«i:LT. > .6 5 'iC-e 1 6 I^H^n-T ^ U -fe' 

SWiteW^^<3^>' A{C*a^ii*tL-Cl,^S C t «:5i 

^nf^-ofzo yjmti^T—i^'t^ -i-'^^ -^T >mizji^m\ii^n-f^^^ (Nagy, f. 

t>: Plant Molecular Biology Manual B4; 1 (1988)) . poly(A)+RNA^;^>'l/A 
T-'UT^b KA^ cDT^o ~;^y;U-emM2^1b^. i-'i u >m (Hybond N; 
Amersham) {cSfe^L. ifif V^iH^O^W U -fe'- a ^ ^./^lo 

;^>«?6DM©RNA^M^LTl/^5ffl<**<*>-^/::*^s -5•<^>'^'*^ t>^3^S©^l,><@^ 

mnm 5 -cRNAOiei^&^*<«i^^$nfc^^<^?^Hg^{^. :sLZ/nmt ur 

pBI12li?J^®fe^L/:::J'^<=i®^*"t>s J£lT®::S"^«-«k «9 5^:^^ rf^i^-'l'^ U -fe 
n-;l/ (PG) > y^;l/-7*^y 'J-'->/l'v^T'>>'^^^y-lrn-;U (SQDG) ^<Dm 
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(1) ±m3^(Di&tii 

)ig||©atilliBligh-Dyer^ (Can J. Biochem. Physiol., 37: 911, 1959) "tf 

C:n{C20 mlO^ o P7}^;i/A : y ^ y -71^ (1:2. <4^Sfit) *An^> Jh^i^:^ 
-r-!f--eS^lK?S'^- 15:^PH^#eL/;:o cnfwiJ' o D;i-s/l/A 12mis.D^II®7k 
12ml ^iO;tSfcL<^S-^L^^. SOOOrpm . 4 "t:. 305^r^<Dig;il^-e7KS <!:*^® 

,j -i/<;}|p-:^-^ffiU, SO'CMJETTf^ii^^^'^^o Cin*2 ml© 
^^OD7^/l/A : ^^y-yl/ (1:4. (*SJt) (C^*^L. ^lli®!fiai«3 1 L^o 

( 2 ) fli@©:a-H 

DBAE-Toyopearl 650C (^V-) ©!SM^^2. 5ml<& 1 MB^K:^ h U 'i? AtK^?^ 
(pH7.0)25 mltjS-lfi^SJSIi L^to dtl^s IIStK^ ^ ^ -^'l^'ejlISefe^Sfe^^ 

50 ml©^ np*7VA : ^ ^ J—J\^ (1 : 4. -H^^Jt) tf^fej^L^o 

^J3M®tttti«Kl^^ -5 50 ml® i7 uu Tix^VA . :f- ^ J —JU ( 

1:4^ -f^^^it) 12^ J if h 'yJ^i^T i^Jl^yV -iz^-Jl^ (MGDG) . i^if 
•5 h i/;l/>?Ti'-'l/^U -feo— ;l/ (D GDG) . ^^7.y rf'i^Jl'^^ J — )\yT X 
> (PE) X 7f:X7r^':^-'^=' U V (PC) LT. tfJ^flM® (MGDG> 

DGDG. PEs P C) M^J-i L^o 5 ml©B^le-e7^X7 r 5^ >*>'l/-lr U 

(PS) ^^th LTHS^^x EOml©^' DO;i-x;l/A : p< ^ y (1:4. {*SfJl:) 
-ei^K^^^L/::^> 50 mlO^ aa;^yVA : > — ;U : 10 Mi^i?T>*— 
A7K^?& (20:80:0.2. #:«Jt) "^PG. SQDG. X 7 r ^ ^ h- 

yl/ (PI) ^^t^i^^^^i/^o 15ml(DJ:.^y-JU^1]U^. ^EETT- 

^fiS^l^l'^^o Ctl^O. 2ml©^ n os^y^A : ^ ^ y — yl/ (2:1. 't4^^ifc) {d 
^*^L. iJttlliM (PG. SQDG. Pl^m^tLtZo 

WO 95/18222

PCT/JP94/02288

MGDG. DGDG. PE. P CM^it. ^-fWttf^M.^zt^hi^'yy^- 

1 mi{c^;i)^L^U3j2^<&^o^'^wl/A-e^^e'^hL/r;{;^A^c:;^>^{t. ^ n a 
T-irhV (4:1). T-Jrhi'. > :^ -^l^-eJUtC^tBt- S mmW (MOD 
G. DGDG) tiT-fehve. ^) yfUm. (PC. PE) y -yl/7f^j±}$n 

(3) mmi7 hiT^y ^ - (TLC) {c^-SPGO^IIiiltigiJIifl*^:^'^ 
(2) t?li/c®^^v/ U 7f7y;U-TLCy h#5721 (Merck) -^r^-fi L /"Co 

m. :7k (50:20:10:15:5. f*:«fii:) ^^ttfliMoJig^ti ^ o n : y 

: 7k (70:21:3. ftiStJt)^^ l^^to TLC-e^lSI^. A U > (8 0 %T-i? 

^e^'^ics ml©^ rJ'y -yi/#5 %ig®e^ipx.. ^:^^*fT85't3-e 2 ^K^SlS 
ll&a&K^^^^WbL/iio — :*r. sn-l. 2 ri©Jigflj?Km^^^ii)5y::a6 
JC. BiJt)IK^/::'> 5 m 1 pn<J^yl/A : ^ -yU (2 : DM 

?£"eil&@^islJR L. $&@L/im. 1 iBl®50 mM TrisCl (pH 7. 2)35.Df0. 05 % 
Triton X-100 *Anx.. Sfc L < LTflg® *:»-tfe$ -a:T. ^ y X :;57 h* ( 
Rhizopus delemar) ( 2 5 0 0 U ; ^- U VT^-tt) ^HU^S 7 

•0-^=3 O^Fal^za-T'SCiJCcfcJjiltRfi^jtcs n-lfi[©J!IM^<£::$3-»$-li-/Co C 
<Dm.ft^mm'S:mfi&^. TLC (i^na;^;l/A : r-fe h> : p^^y-^U : : tK 
= 10:4:2:3:1), t;:<fc *RfSOJ!&!g. U S.OfJl&JKrMJC:53'llt L^o 

h/N-y ^C-R7A plus (a^MflFm) ^fflV^A:o 2 ^. PGtC 

oV>TI^3^. ■€-®fife©ft^fi<jnj:fli®r<tO:S-^M«<£-||4^{c:^-ro ^(i. >t* 
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tSS 4/71 


16:0 


16:1 


16:2 


16:3 


18:0 


18:1 


18:2 


18:3 


116:0+18:0 


SQDG 






1 


0 


0 


3 


2 


7 


36 


54 






Ou 


22 


0 


0 


0 


4 


9 


28 


36 


MGDG 




7 


0 


1 


9 


1 


2 


4 


76 


8 








g 


• 1 


10 


0 


1 


5 


69 


3 


DuDu 




1 v7 


0 


0 


0 


3 


1 


4 


73 


22 






9 


13 


0 


1 


0 


1 


5 


70 


9 


PC 




28 


0 


0 


0 


5 


1 


21 


44 


33 






19 


12 


0 


0 


3 


4 


40 


23 


22 


PE 




20 


0 


0 


0 


3 


1 


6 


70 


23 






18 


10 


0 


0 


2 


2 


31 


38 


20 


PI 




48 


1 


0 


0 


2 


1 


11 


37 


50 






44 


i ' 


0 


0 


1 


2 


18 


28 


45 



PG{::i^^L/;:l!iJ»i^^^<^*S^*'^- Anacystis nidulansS^OSiM^^tS 
^K). ^(Di}^t>^iZl 6 : 1 c i s*<iix.Tl.^S*. ^tz. {i^&tE^itl 8 : 0 
t^nmm ( 1 6 : 0 + 1 6 :• 1 t r a n s + 1 8 : 0 (Xt^T U y^) ) 

m^ib-ih. sn-2«9 S%&.±mmmmWi (l e : OX^tl e : l t r an 
s) X'^ib^tlXisO. ^fz\Zi&Bi^mAizXy)^0SLLfzl 6 : 1 tif -^T s n 

.o® p G® s n - 1 &.(Dmummmmi^x{i>ti < o-ci^ s c t^a^Bj 
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— ^®ffe©fliR<2>MGDG. DGDG. SQDG. PC. PEx PIT?*>> 
1 6 : 0<Dm{S>t. ^n{CO¥JSL/cl 6 : 1©1 0 %HfI^OiiSn*<B« ^>*^t?* <9 . 
ttz. 1 8 : 0 cr>^^^mi\:^M^X'^^■hio C(Dot>. MGDG<i:DGDG{c:o^.^ 
T(il 6 : lO^mt^tLX s n- imzib-ofzi)^ s n - 2 fi:^^e> 
tii$tLf;:o MGDG> DGDG. S Q D GRO^P G {i±{c»^^4c{c#«t-5flg!S 

Ctlb© 4ffi®fliKf^Anacystis nidulansOMf- ^ 
Anacystis nidulans©Jigt-«#^L<£l.^liM-e* L^^ t>iiE^«t^-e«i{C|| 

C©J;-5{C. J|^®^^^^<3©fli!S:^*T®M^A^^x Anacystis nidulansi*: 
<i:$i:T®l|g®0 1 6 : 0 i 1 8 : 0 *€ii6'CS&^^ < ^^fO'fk'^^ S C <!:<£: 2^:^ 





16:0 


16:1 


16:2 


16:3 


18:0 


18:1 


18:2 


18:3 


216:0+18:0 




26 


0 


0 


0 
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47 


21 


31 




17 


13 


0 


0 


2 


4 


58 


6 


19 



C©i^m*^e». Anacystis nidulansS3fe©liMft&M^tafO'fk®^^«s 1^<^^C 
ilC. mo*n£ t»-rS{-*5^^'^^ 1 6 : Oil 8 : 0 ©:?^fi&^0'fk*«ljK L^:: C t 
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immm 7 ] mnmi^ ^^^^ o^mm^um 

tr Ms-HF^jtefc^^. 2 5°c. I 6mrsm. 8mm^(DBM:X'2Mmm^m^ti 

3>hD-yHfi^ (pB 1 121 {c«t«5?gSfi^L trtiUfc^LT 
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ie^jCDS^ : 1 9 6 

m<DWL : — 

E^joai® : Genomic DNA 

^i^ig : Anabaena variabilis 
t5f5« : lAM M-3 

GCT CTG GGG TTG TTG CTG TTA TAT CTA GGC GGG TGG TCT TTT GTG GTC TGG 

GGA GTT TTC TTT CGC ATG GTT TGG GTT TAG CAC TGT ACT TGG TTG GTA AAC 

AGC GCT ACC CAT AAG TTT GGC TAG CGC ACC TAT GAT GCT GGT GAG AGA TCC 

ACT AAC TGT TGG TGG GTA GCT GTC CTA GTG TTT GGT GAA GGT T 

mnm^ •. 2 
mnoS:-^ : 6 5 
mn<Dm : t ^ 

^#JiS : Anabaena variabilis 
: lAM M-3 

mn : 

Ala Leu Gly Leu Leu Leu Leu Tyr Leu Gly Gly Trp Ser Phe Val Val Trp
Gly Val Phe Phe Arg Ile Val Trp Val Tyr His Cys Thr Trp Leu Val Asn Ser
Ala Thr His Lys Phe Gly Tyr Arg Thr Tyr Asp Ala Gly Asp Arg Ser Thr Asn
Cys Trp Trp Val Ala Val Leu Val Phe Gly Glu Gly
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BB^J#-^ : 3 
ie?iJ©S$ : 8 3 7 

E7"J®S^ : Genomic DNA 

mm 

: Anacystis nidulans 
1*^ : R2-SPC 

ATG ACC CTT GCT ATC CGA CCC AAG CTT GCC TTC AAC TGG CCG ACC GCC CTG
TTC ATG GTC GCC ATT CAC ATT GGA GCA CTG TTA GCG TTC CTG CCG GCC AAC
TTT AAC TGG CCC GCT GTG GGC GTG ATG GTT GCG CTG TAT TAC ATT ACC GGT
TGT TTT GGC ATC ACC CTA GGC TGG CAC CGG CTA ATT TCG CAC CGT AGC TTT
GAA GTT CCC AAA TGG CTG GAA TAC GTG CTG GTG TTC TGT GGC ACC TTG GCC 
ATG CAG CAC GGC CCG ATC GAA TGG ATC GGT CTG CAC CGC CAC CAT CAC CTC 
CAC TCT GAC CAA GAT GTC GAT CAC CAC GAC TCC AAC AAG GGT TTC CTC TGG 
AGT CAC TTC CTG TGG ATG ATC TAC GAA ATT CCG GCC CGT ACG GAA GTA GAC 
AAG TTC ACG CGC GAT ATC GCT GGC GAC CCT GTC TAT CGC TTC TTT AAC AAA 
TAT TTC TTC GGT GTC CAA GTC CTA CTG GGG GTA CTT TTG TAC GCC TGG GGC 
GAG GCT TCG GTT GGC AAT GGC TGG TCT TTC GTC GTT TGG GGG ATC TTC GCC 
CGC TTG GTG GTG GTC TAC CAC GTC ACT TGG CTG GTG AAC AGT GCT ACC CAC 
AAG TTT GGC TAC CGC TCC CAT GAG TCT GGC GAC CAG TCC ACC AAC TGC TGG 
TGG GTT GCC CTT CTG GCC TTT GGT GAA GGC TGG CAC AAC AAC CAC CAC GCC 
TAC CAG TAC TCG GCA CGT CAT GGC CTG CAG TGG TGG GAA TTT GAC TTG ACT 
TGG TTG ATC ATC TGC GGC CTG AAG AAG GTG GGT CTG GCT CGC AAG ATC AAA 
GTG GCG TCT CCA AAC AAC TAA 
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BE^JOS^ : 2 7 8 

mm 

: Anacystis nidulans 
tSfejg : R2-SPC 

Met Thr Leu Ala Ile Arg Pro Lys Leu Ala Phe Asn Trp Pro Thr Ala Leu Phe
Met Val Ala Ile His Ile Gly Ala Leu Leu Ala Phe Leu Pro Ala Asn Phe Asn
Trp Pro Ala Val Gly Val Met Val Ala Leu Tyr Tyr Ile Thr Gly Cys Phe Gly
Ile Thr Leu Gly Trp His Arg Leu Ile Ser His Arg Ser Phe Glu Val Pro Lys
Trp Leu Glu Tyr Val Leu Val Phe Cys Gly Thr Leu Ala Met Gln His Gly Pro
Ile Glu Trp Ile Gly Leu His Arg His His His Leu His Ser Asp Gln Asp Val
Asp His His Asp Ser Asn Lys Gly Phe Leu Trp Ser His Phe Leu Trp Met Ile
Tyr Glu Ile Pro Ala Arg Thr Glu Val Asp Lys Phe Thr Arg Asp Ile Ala Gly
Asp Pro Val Tyr Arg Phe Phe Asn Lys Tyr Phe Phe Gly Val Gln Val Leu Leu
Gly Val Leu Leu Tyr Ala Trp Gly Glu Ala Trp Val Gly Asn Gly Trp Ser Phe
Val Val Trp Gly Ile Phe Ala Arg Leu Val Val Val Tyr His Val Thr Trp Leu
Val Asn Ser Ala Thr His Lys Phe Gly Tyr Arg Ser His Glu Ser Gly Asp Gln
Ser Thr Asn Cys Trp Trp Val Ala Leu Leu Ala Phe Gly Glu Gly Trp His Asn
Asn His His Ala Tyr Gln Tyr Ser Ala Arg His Gly Leu Gln Trp Trp Glu Phe
Asp Leu Thr Trp Leu Ile Ile Cys Gly Leu Lys Lys Val Gly Leu Ala Arg Lys
Ile Lys Val Ala Ser Pro Asn Asn

iE3«J#-f- : 5 
iE2fiJ©S$ : 1 8 

K^J©M : mm 
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ATGACAATTG CTACTTCA 

15^iJ#-f- : 6 
iE3^JO:g$ : 1 5 

se^j©M : 
m<Jo>m : — ^ift 

mncDUm : fife CD ^^DNA 

GCTCTGGGGT TGTTG 

BE^J#-^ : 7 
SE^J©S$ : 1 5 

IB^J : 

CAACAACCCC AGAGC 
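
By way of illustration only, the two 15-nucleotide sequences listed above as sequence numbers 6 and 7 appear to be reverse complements of one another. The following minimal Python sketch checks that relationship. The helper name `reverse_complement` is hypothetical and is introduced here for illustration; it is not part of the original disclosure.

```python
# Illustrative sketch only: the helper below is an assumption, not from the source.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Sequences copied from the listing above, with the spacing removed.
seq6 = "GCTCTGGGGTTGTTG"
seq7 = "CAACAACCCCAGAGC"

# True when the two primers are exact reverse complements.
print(reverse_complement(seq6) == seq7)
```

The comparison is symmetric, so applying the same helper to sequence number 7 would be expected to return sequence number 6.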

B2^iJ#-^ : 8 
ee^iJO:^^ : 1 8 

2 8 
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mn : 

RTGRTGRTTR TTRTGCCA 

rnnoS:-^ : 1 "7 
ia3?>J©M : 

mm(^Mm mourn -^rkdna 

ACGTCATGGC CTGCAGT 
iB^JC0S$ : 2 6 

CGCGGATCCT TAGTTGTTTG GAGACG 

1E?|J#-^ : 1 1 
1S^J®S$ : 2 5 

ie^j©^ : 
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CATATGACCC TTGCTATCCG ACCCA 



my^m^ • 1 2 

iE^J©ft$ : 2 9 

AGCTTGGGTC GGATAGCAAG GGTCATATG 
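
By way of illustration only, the 25-nucleotide sequence listed above as sequence number 11 begins with an NdeI recognition site (CATATG), and the codons that follow the ATG may be compared with the N-terminal residues of the Anacystis nidulans sequence given in sequence number 4. The following Python sketch translates the primer using the standard genetic code. The function name `translate_from_first_atg` is hypothetical and is introduced only for this sketch.

```python
# Illustrative sketch only: names below are assumptions, not from the source.
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_from_first_atg(dna: str) -> str:
    """Translate complete codons starting at the first ATG; '*' marks a stop."""
    start = dna.find("ATG")
    if start < 0:
        return ""
    codons = (dna[i:i + 3] for i in range(start, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# Sequence number 11 as listed above, with the spacing removed.
primer11 = "CATATGACCCTTGCTATCCGACCCA"

print(primer11.startswith("CATATG"))       # NdeI site at the 5' end
print(translate_from_first_atg(primer11))  # MTLAIRP
```

The seven residues obtained (Met Thr Leu Ala Ile Arg Pro) agree with the first seven residues of sequence number 4; the final unpaired nucleotide of the primer is ignored by the sketch.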
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It 5^ O IS ffl 
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mi 



des9var 
MSCD2 



10 20 30 40 . 50 60 

AL6LLLLYLGGWSFVVWGVFFRIVWVYHCTWLVNSATHKFSYRTYDA6DRSTHCWWVAVL 



I •••••• 



LVPWYCWGETFVNSLCVSTFLRYAVVLNATWLVMSAAHLY6YRPYDKHISSRENILVSMG 
2U 250 260 270 280 290 



des9var 
MSCD2 



VFGEG 

:x 

AVGER 
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disSnid 

M5CD2 

d e s 9 n i d 
MSCD2 

desSni d 
MSCD2 

MSCD2 
MSCD2 

d«s9nid 
M$CD2 



10 20 30 40 SO 60 

MTLAIRPKLAFNWP.TALFMVAIHIGALLAFLPANFNWPAVGVMVALYYITGCFG1TL6WH 

V . . . • .... • • • • • • . . 

A • 

OOEGPPPUEYVWRNIILMALLHLGALYGITLVPSCUYTCLFAYLYYVISALGITAGAH 
60 70 80 90 too 110 

70 <0 90 1 00 no 120 

RLISHRSFEVPKWLEYVLVFCGTLAMQHGPIEWIGLHRHHHLHSDQOVDHHDSNKGFLWS 



• • « 



BLWSHRTYKARLPLRLFLI lANTMAFQNOVYEWARDHRAHHKFSETHAOPHNSRRGFFFS 

120 no MO 150 ISO no 

130 HO ISO 160 170 

HFLW-MIYEIPA-RTEVOKFTRDIAGOPVYRFFNKYFFGVQVLLGVLLYAWGEAWVGNGW 

V • • • • • ' • .II 

HVGWLLVRKHPAVKEIGGKLOMSOLKAEKLVMFQRRYYKPOLLLMCFYLPTLVPWYCWGE 
180 190 200 210 220 230 

180 . 190 200 210 220 230 

Spv VWGIFARLVVVYHVTWLVHSATHKFGYRSHES60QSTNCWWVALLAFGE6WHNN 



• ••••• • • 



TFYNSLCVSTFLRYAVVLNATWLVNSAAHLYGYRPYDKNISSRENILVSMGAVGERFHMY 
240 2S0 260 270 280 290 

240 ' 2S0 260 270 

HHAYOYSARHGLQMEFOLTWLllCGLKKVGLARnKVASPNH 

. • • • • • • • • • 

••••••a ■ • » •••• 

HHAFPYDYSASEYRWHINFTTFFIOCMALLGLAYORKRVSRAA 
300 310 320 330 340 
COMPOSITIONS AND USES

This application is a continuation-in-part of USSN
07/494,106 filed on March 16, 1990 and a continuation-in-part
of USSN 07/567,373 filed on August 13, 1990 and a
continuation-in-part of USSN 07/615,784 filed on November
14, 1990.



Technical Field

The present invention is directed to desaturase
enzymes relevant to fatty acid synthesis in plants,
enzymes, amino acid and nucleic acid sequences and methods
related thereto, and novel plant entities and/or oils and
methods related thereto.

INTRODUCTION

Background

Novel vegetable oil compositions and/or improved
means to obtain or manipulate fatty acid compositions, from
biosynthetic or natural plant sources, are needed.
Depending upon the intended oil use, various different oil
compositions are desired. For example, edible oil sources
containing the minimum possible amounts of saturated fatty
acids are desired for dietary reasons, and alternatives to
current sources of highly saturated oil products, such as
tropical oils, are also needed.

One means postulated to obtain such oils and/or
modified fatty acid compositions is through the genetic
engineering of plants. However, in order to genetically
engineer plants one must have in place the means to
transfer genetic material to the plant in a stable and
heritable manner. Additionally, one must have nucleic acid
sequences capable of producing the desired phenotypic
result, regulatory regions capable of directing the correct
application of such sequences, and the like. Moreover, it
should be appreciated that producing a desired modified
oils phenotype requires that the Fatty Acid Synthetase
(FAS) pathway of the plant be modified to the extent that
the ratios of reactants are modulated or changed.
Higher plants appear to synthesize fatty acids via a
common metabolic pathway in plant plastid organelles (i.e.,
chloroplasts, proplastids, or other related organelles) as
part of the FAS complex. Outside of plastid organelles,
fatty acids are incorporated into triglycerides and used in
plant membranes and in neutral lipids. In developing
seeds, where oils are produced and stored as sources of
energy for future use, FAS occurs in proplastids.

The production of fatty acids begins in the plastid
with the reaction between Acyl Carrier Protein (ACP) and
acetyl-CoA to produce acetyl-ACP. Through a sequence of
cyclical reactions, the acetyl-ACP is elongated to 16- and
18-carbon fatty acids. The longest chain fatty acids
produced by the FAS are 18 carbons long. Monounsaturated
fatty acids are also produced in the plastid through the
action of a desaturase enzyme.

Common plant fatty acids, such as oleic, linoleic and
α-linolenic acids, are the result of sequential
desaturation of stearate. The first desaturation step is
the desaturation of stearoyl-ACP (C18:0) to form oleoyl-ACP
(C18:1) in a reaction often catalyzed by a Δ-9 desaturase,
also often referred to as a "stearoyl-ACP desaturase"
because of its high activity toward stearate, the 18-carbon
acyl-ACP. The desaturase enzyme functions to add a double
bond at the ninth carbon in accordance with the following
reaction (I):

Stearoyl-ACP + ferredoxin (II) + O2 + 2H+ ->
oleoyl-ACP + ferredoxin (III) + 2H2O.
Δ-9 desaturases have been studied in partially
purified preparations from numerous plant species. Reports
indicate that the protein is a dimer, perhaps a homodimer,
displaying a molecular weight of 68 kD (±8 kD) by gel
filtration and a molecular weight of 36 kD by
SDS-polyacrylamide gel electrophoresis.
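
By way of illustration only, the dimer interpretation can be checked arithmetically: two subunits of 36 kD give 72 kD, which falls within the reported gel filtration range of 68 kD ± 8 kD. The following Python sketch expresses that check. The function name and its parameters are hypothetical and are not taken from the cited reports.

```python
# Illustrative sketch only: the function name and parameters are assumptions.
def consistent_with_oligomer(native_kd: float, tolerance_kd: float,
                             subunit_kd: float, n_subunits: int = 2) -> bool:
    """Check whether n identical subunits fall within the native mass range."""
    predicted_kd = n_subunits * subunit_kd
    return abs(predicted_kd - native_kd) <= tolerance_kd


# Values quoted in the text: 68 kD (±8 kD) native, 36 kD under SDS-PAGE.
print(consistent_with_oligomer(68.0, 8.0, 36.0))                # dimer: 72 kD
print(consistent_with_oligomer(68.0, 8.0, 36.0, n_subunits=1))  # monomer: 36 kD
```

Under these figures the dimer is consistent with the native mass and the monomer is not, which is the reasoning behind the "perhaps a homodimer" statement.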
In subsequent sequential steps for triglyceride
production, polyunsaturated fatty acids may be produced.
These desaturations occur outside of the plastid as a
result of the action of membrane-bound enzymes. Additional
double bonds are added at the 12-position carbon and
thereafter, if added, at the 15-position carbon through the
action of Δ-12 desaturase and Δ-15 desaturase,
respectively.
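
By way of illustration only, the sequential desaturation described above (stearate to oleate at the 9 position, then to linoleate at the 12 position, then to α-linolenate at the 15 position) can be sketched as a small Python model. The class `FattyAcid` and the function `desaturate` are hypothetical names introduced for this sketch; they model only the shorthand notation and not any enzyme mechanism.

```python
# Illustrative sketch only: class and function names are assumptions.
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FattyAcid:
    carbons: int
    double_bonds: Tuple[int, ...] = ()

    def shorthand(self) -> str:
        """Return the carbons:double-bonds notation, for example 18:1."""
        return f"{self.carbons}:{len(self.double_bonds)}"


def desaturate(fa: FattyAcid, position: int) -> FattyAcid:
    """Return a new fatty acid with one more double bond at the given carbon."""
    if not 1 <= position < fa.carbons:
        raise ValueError("position lies outside the carbon chain")
    if position in fa.double_bonds:
        raise ValueError("a double bond is already present at this position")
    return FattyAcid(fa.carbons, tuple(sorted(fa.double_bonds + (position,))))


stearate = FattyAcid(18)                # 18:0
oleate = desaturate(stearate, 9)        # delta-9 desaturase step
linoleate = desaturate(oleate, 12)      # delta-12 desaturase step
linolenate = desaturate(linoleate, 15)  # delta-15 desaturase step

for name, fa in [("stearate", stearate), ("oleate", oleate),
                 ("linoleate", linoleate), ("alpha-linolenate", linolenate)]:
    print(name, fa.shorthand(), fa.double_bonds)
```

Each step leaves the chain length unchanged and adds exactly one double bond, which is why the shorthand advances from 18:0 to 18:1, 18:2 and 18:3.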

Obtaining nucleic acid sequences capable of producing
a phenotypic result in FAS, desaturation and/or
incorporation of fatty acids into a glycerol backbone to
produce an oil is subject to various obstacles including
but not limited to the identification of metabolic factors
of interest, choice and characterization of a protein
source with useful kinetic properties, purification of the
protein of interest to a level which will allow for its
amino acid sequencing, utilizing amino acid sequence data
to obtain a nucleic acid sequence capable of use as a probe
to retrieve the desired DNA sequence, and the preparation
of constructs, transformation and analysis of the resulting
plants.

Thus, the identification of enzyme targets and useful
plant sources for nucleic acid sequences of such enzyme
targets capable of modifying fatty acid compositions is
needed. Ideally, an enzyme target will be amenable to one
or more applications alone or in combination with other
nucleic acid sequences relating to increased/decreased oil
production, the ratio of saturated to unsaturated fatty
acids in the fatty acid pool, and/or to novel oils
compositions as a result of the modifications to the fatty
acid pool. Once enzyme target(s) are identified and
qualified, quantities of protein and purification protocols
are needed for sequencing. Ultimately, useful nucleic acid
constructs having the necessary elements to provide a
phenotypic modification and plants containing such
constructs are needed.
A 200-fold purification of Carthamus tinctorius
("safflower") stearoyl-ACP desaturase was reported by
McKeon & Stumpf in 1982, following the first publication of
their protocol in 1981. McKeon, T. & Stumpf, P.,
J. Biol. Chem. (1982) 257:12141-12147; McKeon, T. & Stumpf,
P., Methods in Enzymol. (1981) 71:275-281.

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 provides amino acid sequence of fragments
relating to C. tinctorius desaturase. Fragments F1 through
F11 are also provided in the sequence listing as SEQ ID NO:
1 through SEQ ID NO: 11, respectively. Each fragment
represents a synthesis of sequence information from
peptides originating from different digests which have been
matched and aligned. In positions where there are two
amino acids indicated, the top one corresponds to that
found in the translation of the cDNA; the lower one was
detected either as a second signal at the same position of
one of the sequenced peptides, or as a single unambiguous
signal found in one or more of the overlapping peptides
comprising the fragment. Residues in F9 shown in lower
case letters represent positions where the called sequence
does not agree with that predicted from the cDNA, but where
the amino acid assignment is tentative because of the
presence of a contaminating peptide. The standard one
letter code for amino acid residues has been used. X
represents a position where no signal was detectable, and
which could be a modified residue. F1 corresponds to the
N-terminal sequence of the mature protein. The underlined
region in F2 is the sequence used in designing PCR primers
for probe synthesis.

Fig. 2 provides a cDNA sequence (SEQ ID NO: 12) and
the corresponding translational peptide sequence (SEQ ID
NO: 13) derived from C. tinctorius desaturase. The cDNA
sequence includes both the plastid transit peptide encoding
sequence and the sequence encoding the mature protein.
Fig. 3 provides cDNA sequence of Ricinus communis
desaturase. Fig. 3A provides preliminary partial cDNA
sequence of a 1.7 kb clone of R. communis desaturase (SEQ
ID NO: 14). The sequence is from the 5' end of the clone.
Fig. 3B provides the complete cDNA sequence of the
approximately 1.7 kb clone (SEQ ID NO: 15) and the
corresponding translational peptide sequence (SEQ ID NO:
16).

Fig. 4 provides sequence of Brassica campestris
desaturase. Fig. 4A represents partial DNA sequence of a
1.6 kb clone, pCGN3235 (SEQ ID NO: 17), from the 5' end of
the clone. Fig. 4B represents partial DNA sequence of a
1.2 kb clone, pCGN3236, from the 5' end of the clone (SEQ
ID NO: 18). Initial sequence for the 3' ends of the two B.
campestris desaturase clones indicates that pCGN3236 is a
shorter cDNA for the same clone as pCGN3235. Fig. 4C
provides complete cDNA sequence of B. campestris desaturase
above, pCGN3235 (SEQ ID NO: 19) and the corresponding
translational peptide sequence (SEQ ID NO: 20).

Fig. 5 provides preliminary partial cDNA sequence of
Simmondsia chinensis desaturase (SEQ ID NO: 43). The
translated amino acid sequence is also shown.

Fig. 6 shows the design of forward and reverse primers
(SEQ ID NO: 21 through SEQ ID NO: 26) used in polymerase
chain reaction (PCR) from the sequence of C. tinctorius
desaturase peptide "Fragment F2" (SEQ ID NO: 2).

Fig. 7 provides maps of desaturase cDNA clones showing
selected restriction enzyme sites. Fig. 7A represents a C.
tinctorius clone, Fig. 7B represents a R. communis clone,
and Fig. 7C represents a B. campestris clone.

Fig. 8 provides approximately 3.4 kb of genomic
sequence of Bce 4 (SEQ ID NO: 27).

Fig. 9 provides approximately 4 kb of genomic sequence
of Bcg 4-4 ACP sequence (SEQ ID NO: 28).

Fig. 10 provides a restriction map of cloned λCGN 1-2
showing the entire napin coding region sequence as well as
extensive 5' upstream and 3' downstream sequences (SEQ ID
NO: 29).
SUMMARY OF THE INVENTION

By this invention, compositions and methods of use of
plant desaturase enzymes, especially Δ-9 desaturases, are
provided. Of special interest are methods and compositions
of amino acids and nucleic acid sequences related to
biologically active plant desaturases as well as sequences,
especially nucleic acid sequences, which are to be used as
probes, vectors for transformation or cloning
intermediates. Biologically active sequences may be found
in a sense or anti-sense orientation as to transcriptional
regulatory regions found in various constructs.

A first aspect of this invention relates to C.
tinctorius Δ-9 desaturase substantially free of seed
storage protein. Amino acid sequence of this desaturase is
provided in Fig. 2 and as SEQ ID NO: 13.

DNA sequence of C. tinctorius desaturase gene (SEQ ID
NO: 12) is provided, as well as DNA sequences of desaturase
genes from a Ricinus (SEQ ID NO: 14 and SEQ ID NO: 15), a
Brassica (SEQ ID NO: 17 through SEQ ID NO: 19) and a
Simmondsia (SEQ ID NO: 43) plant.

In yet a different embodiment of this invention, plant
desaturase cDNA of at least 10 nucleotides or preferably at
least 20 nucleotides and more preferably still at least 50
nucleotides, known or homologously related to known Δ-9
desaturase(s), is also provided. The cDNA encoding
precursor desaturase or, alternatively, biologically
active, mature desaturase is provided herein.

Methods to use nucleic acid sequences to obtain other
plant desaturases are also provided. Thus, a plant
desaturase may be obtained by the steps of contacting a
nucleic acid sequence probe comprising nucleotides of a
known desaturase sequence and recovering DNA sequences
encoding plant desaturase which have hybridized with the probe.

This invention also relates to methods for obtaining
plant Δ-9 desaturase by contacting an antibody specific to
a known desaturase, such as C. tinctorius stearoyl-ACP
desaturase, with a candidate plant stearoyl-ACP desaturase
under conditions conducive to the formation of an
antigen:antibody immunocomplex and the recovery of the
candidate plant stearoyl-ACP desaturase which reacts
thereto.

In a further aspect of this invention, DNA constructs
comprising a first DNA sequence encoding a plant desaturase
and a second DNA sequence which is not naturally found
joined to said plant desaturase are provided. This
invention also relates to the presence of such constructs
in host cells, especially plant host cells. In yet a
different aspect, this invention relates to transgenic host
cells which have an expressed desaturase therein.

Constructs of this invention may contain, in the 5' to
3' direction of transcription, a transcription initiation
control regulatory region capable of promoting
transcription in a host cell and a DNA sequence encoding
plant desaturase. Transcription initiation control
regulatory regions capable of expression in prokaryotic or
eukaryotic host cells are provided. Most preferred are
transcription initiation control regions capable of
expression in plant cells, and more preferred are
transcription and translation initiation regions
preferentially expressed in plant cells during the period
of lipid accumulation. The DNA sequence encoding plant
desaturase of this invention may be found in either the
sense or anti-sense orientation to the transcription
initiation control region.

Specific constructs, expression cassettes having in
the 5' to 3' direction of transcription, a transcription
and translation initiation control regulatory region
comprising sequence immediately 5' to a structural gene
preferentially expressed in plant seed during lipid
accumulation, a DNA sequence encoding desaturase, and
sequence 3' to the structural gene are also provided. The
construct may preferably contain DNA sequences encoding
plant desaturase obtainable (including obtained) from
Carthamus, Ricinus, Brassica or Simmondsia Δ-9 desaturase
genes. Transcription and translation initiation control
regulatory regions are preferentially obtained from
structural genes preferentially expressed in plant embryo
tissue such as napin, seed-ACP or Bce-4.
By this invention, methods and constructs to inhibit
the production of endogenous desaturase are also provided.
For example, an anti-sense construct comprising, in the 5'
to 3' direction of transcription, a transcription
initiation control regulatory region functional in a plant
cell, and an anti-sense DNA sequence encoding a portion of
a plant Δ-9 desaturase may be integrated into a plant host
cell to decrease desaturase levels.

In yet a different embodiment, this invention is
directed to a method of producing plant desaturase in a
host cell comprising the steps of growing a host cell
comprising an expression cassette, which would contain in
the direction of transcription, a) a transcription and
translation initiation region functional in said host cell,
b) the DNA sequence encoding a plant desaturase in reading
frame with said initiation region, and c) a transcript
termination region functional in said host cell, under
conditions which will promote the expression of the plant
desaturase. Cells containing a plant desaturase as a
result of the production of the plant desaturase encoding
sequence are also contemplated herein.

By this invention, a method of modifying fatty acid
composition in a host plant cell from a given level of
fatty acid saturation to a different level of fatty acid
saturation is provided by growing a host plant cell having
integrated into its genome a recombinant DNA sequence
encoding a plant desaturase in either a sense or anti-sense
orientation under control of regulatory elements functional
in said plant cell during lipid accumulation under
conditions which will promote the activity of said
regulatory elements. Plant cells having such a modified
level of fatty acid saturation are also contemplated
hereunder. Oilseeds having such a modified level of fatty
acid saturation and oils produced from such oilseeds are
further provided.

DETAILED DESCRIPTION OF THE INVENTION

A plant desaturase of this invention includes any
sequence of amino acids, such as a protein, polypeptide, or
peptide fragment, obtainable from a plant source which is
capable of catalyzing the insertion of a first double bond
into a fatty acyl-ACP moiety in a plant host cell, i.e., in
vivo, or in a plant cell-like environment, i.e., in vitro.
"A plant cell-like environment" means that any necessary
conditions are available in an environment (i.e., such
factors as temperatures, pH, lack of inhibiting substances)
which will permit the enzyme to function. In particular,
this invention relates to enzymes which add such a first
double bond at the ninth carbon position in a fatty
acyl-ACP chain. There may be similar plant desaturase enzymes
of this invention with different specificities, such as the
Δ-12 desaturase of carrot.
20 Nucleotide sequences encoding desaturases may be 

obtained from natural sources or be partially or wholly 
artificially synthesized. They may directly correspond to 
a desaturase endogenous to a natural plant source or 
contain modified amino acid sequences, such as sequences 
25 which have been mutated, truncated, increased or the like. 
Desaturases may be obtained by a variety of methods, 
including but not limited to, partial or homogenous 
purification of plant extracts, protein modeling, nucleic 
acid probes, antibody preparations and sequence 
30 comparisons. Typically a plant desaturase will be derived 
in whole or in part from a natural plant source. 

Of special interest are Δ-9 desaturases which are
obtainable, including those which are obtained, from
Carthamus, Ricinus, Simmondsia, or Brassica, for example
C. tinctorius, R. communis, S. chinensis and B. campestris,
respectively, or from plant desaturases which are
obtainable through the use of these sequences.
"Obtainable" refers to those desaturases which have
sufficiently similar sequences to that of the native
sequences provided herein to provide a biologically active
desaturase.

Once a DNA sequence which encodes a desaturase is
obtained, it may be employed as a gene of interest in a
nucleic acid construct or in probes in accordance with this
invention. A desaturase may be produced in host cells for
harvest or as a means of effecting a contact between the
desaturase and its substrate. Constructs may be designed
to produce desaturase in either prokaryotic or eukaryotic
cells. Plant cells containing recombinant constructs
encoding biologically active desaturase sequences, both
expression and anti-sense constructs, as well as plants and
cells containing modified levels of desaturase proteins are
of special interest. For use in a plant cell, constructs
may be designed which will effect an increase or a decrease
in amount of endogenous desaturase available to a plant
cell transformed with such a construct.

Where the target gene encodes an enzyme, such as a
plant desaturase, which is already present in the host
plant, there are inherent difficulties in analyzing mRNA,
engineered protein or enzyme activity, and modified fatty
acid composition or oil content in plant cells, especially
in developing seeds, each of which can be evidence of
biological activity. This is because the levels of the
message, enzyme and various fatty acid species are changing
rapidly during the stage where measurements are often made,
and thus it can be difficult to discriminate between
changes brought about by the presence of the foreign gene
and those brought about by natural developmental changes in
the seed. Where an expressed Δ-9 desaturase DNA sequence
is derived from a plant species heterologous to the plant
host into which the sequence is introduced and has a
distinguishable DNA sequence, it is often possible to
specifically probe for expression of the foreign gene with
oligonucleotides complementary to unique sequences of the
inserted DNA/RNA. And, if the foreign gene codes for a
protein with slightly different protein sequence, it may be
possible to obtain antibodies which recognize unique
epitopes on the engineered protein. Such antibodies can be
obtained by mixing the antiserum to the foreign protein
with extract from the host plants or with extracts
containing the host plant enzyme. For example, one can
isolate antibodies uniquely specific to a C. tinctorius Δ-9
desaturase by mixing antiserum to the desaturase with an
extract containing a Brassica Δ-9 desaturase. Such an
approach will allow the detection of C. tinctorius
desaturase in Brassica plants transformed with the C.
tinctorius desaturase gene. In plants expressing an
endogenous gene in an antisense orientation, the problem is
slightly different. In this case, there are no specific
reagents to measure expression of a foreign protein.
However, one is attempting to measure a decrease in an
enzyme activity that normally is increasing during
development. This makes detection of expression a simpler
matter. In the final seed maturation phase, enzyme
activities encoded by genes affecting oil composition
usually disappear and cannot be detected in final mature
seed. Analysis of the fatty acid content may be performed
by any manner known to those skilled in the art, including
gas chromatography, for example.

By increasing the amount of desaturase available in
the plant cell, an increased percentage of unsaturated
fatty acids may be provided; by decreasing the amount of
desaturase, an increased percentage of saturated fatty
acids may be provided. (Modifications in the pool of fatty
acids available for incorporation into triglycerides may
likewise affect the composition of oils in the plant cell.)
Thus, an increased expression of desaturase in a plant cell
may result in an increased proportion of fatty acids such as
one or more of palmitoleate (C16:1), oleate (C18:1),
linoleate (C18:2) and linolenate (C18:3). Of
special interest is the production of triglycerides having
increased levels of oleate. Using anti-sense technology,
alternatively, a decrease in the amount of desaturase
available to the plant cell is expected, resulting in a
higher percentage of saturates such as one or more of
laurate (C12:0), myristate (C14:0), palmitate (C16:0),
stearate (C18:0), arachidate (C20:0), behenate (C22:0) and
lignocerate (C24:0). Of special interest is the production
of triglycerides having increased levels of stearate or
palmitate and stearate. In addition, the production of a
variety of ranges of such saturates is desired. Thus,
plant cells having lower and higher levels of stearate
fatty acids are contemplated. For example, fatty acid
compositions, including oils, having a 10% level of
stearate as well as compositions designed to have up to a
60% level of stearate or other such modified fatty acid(s)
composition are contemplated.
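
The Cn:m shorthand used above gives the carbon chain length (n) and the number of double bonds (m), so a fatty acid is saturated exactly when m is 0. As a minimal sketch, the share of saturates in a composition can be tallied as follows. The function names and the example composition are hypothetical and for illustration only; they are not data from this specification.

```python
def is_saturated(shorthand: str) -> bool:
    """Return True when a 'Cn:m' shorthand has zero double bonds (m == 0)."""
    _carbons, double_bonds = shorthand.lstrip("Cc").split(":")
    return int(double_bonds) == 0


def percent_saturated(composition: dict) -> float:
    """Percentage of the total composition contributed by saturated species."""
    total = sum(composition.values())
    saturated = sum(v for k, v in composition.items() if is_saturated(k))
    return 100.0 * saturated / total


# Hypothetical weight-percent composition, invented for illustration.
example_oil = {
    "C16:0": 5.0,   # palmitate
    "C18:0": 10.0,  # stearate
    "C18:1": 60.0,  # oleate
    "C18:2": 20.0,  # linoleate
    "C18:3": 5.0,   # linolenate
}

if __name__ == "__main__":
    print(percent_saturated(example_oil))
```

Under this sketch, decreasing desaturase activity would be expected to shift weight from the C18:1 entry toward C18:0, raising the computed percentage.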

The modification of fatty acid compositions may also
affect the fluidity of plant membranes. Different lipid
concentrations have been observed in cold-hardened plants,
for example. By this invention, one may be capable of
introducing traits which will lend to chill tolerance.
Constitutive or temperature inducible transcription
initiation regulatory control regions may have special
applications for such uses.

Other applications for use of cells or plants
producing desaturase may also be found. For example,
potential herbicidal agents selective for plant desaturase
may be obtained through screening to ultimately provide
environmentally safe herbicide products. The plant
desaturase can also be used in conjunction with chloroplast
lysates to enhance the production and/or modify the
composition of the fatty acids prepared in vitro. The
desaturase can also be used for studying the mechanism of
fatty acid formation in plants and bacteria. For these
applications, constitutive promoters may find the best use.

Constructs which contain elements to provide the
transcription and translation of a nucleic acid sequence of
interest in a host cell are "expression cassettes".
Depending upon the host, the regulatory regions will vary,
including regions from structural genes from viruses,
plasmid or chromosomal genes, or the like. For expression
in prokaryotic or eukaryotic microorganisms, particularly
unicellular hosts, a wide variety of constitutive or
regulatable promoters may be employed. Among the
transcriptional initiation regions which have been
described are regions from bacterial and yeast hosts, such
as E. coli, B. subtilis, Saccharomyces cerevisiae,
including genes such as β-galactosidase, T7 polymerase, trp
E and the like.

A recombinant construct for expression of desaturase
in a plant cell ("expression cassette") will include, in
the 5' to 3' direction of transcription, a transcription
and translation initiation control regulatory region (the
transcriptional and translational initiation regions
together often also known as a "promoter") functional in a
plant cell, a nucleic acid sequence encoding a plant
desaturase, and a transcription termination region.
Numerous transcription initiation regions are available
which provide for a wide variety of constitutive or
regulatable, e.g., inducible, transcription of the
desaturase structural gene. Among transcriptional
initiation regions used for plants are such regions
associated with cauliflower mosaic viruses (35S, 19S), and
structural genes such as for nopaline synthase or mannopine
synthase or napin and ACP promoters, etc. The
transcription/translation initiation regions corresponding
to such structural genes are found immediately 5' upstream
to the respective start codons. Thus, depending upon the
intended use, different promoters may be desired.

Of special interest in this invention is the use of
promoters which are capable of preferentially expressing
the desaturase in seed tissue, in particular, at early
stages of seed oil formation. Examples of such seed-
specific promoters include the region immediately 5'
upstream of napin or seed ACP genes, such as described in
co-pending USSN 147,781, and the Bce-4 gene such as
described in co-pending USSN 494,722. Alternatively, the
use of the 5' regulatory region associated with an
endogenous plant desaturase structural gene and/or the
transcription termination regions found immediately 3'
downstream to the gene, may often be desired.

In addition, for some applications, use of more than
one promoter may be desired. For example, one may design a
dual promoter expression cassette, each promoter having a
desaturase sequence under its regulatory control. For
example, the combination of an ACP and napin cassette could
be useful for increased production of desaturase in a seed-
specific fashion over a longer period of time than either
individually.

To decrease the amount of desaturase found in a plant
host cell, anti-sense constructs may be prepared and then
inserted into the plant cell. By "anti-sense" is meant a
DNA sequence in the 5' to 3' direction of transcription in
relation to the transcription initiation region, which
encodes a sequence complementary to the sequence of a
native desaturase. It is preferred that an anti-sense
plant desaturase sequence be complementary to a plant
desaturase gene indigenous to the plant host. Sequences
found in an anti-sense orientation may be found in
constructs providing for transcription or transcription and
translation of the DNA sequence encoding the desaturase,
including expression cassettes. Constructs having more
than one desaturase sequence under the control of more than
one promoter or transcription initiation region may also be
employed with desaturase constructs. Various transcription
initiation regions may be employed. One of ordinary skill
in the art can readily determine suitable regulatory
regions. Care may be necessary in selecting transcription
initiation regions to avoid decreasing desaturase activity
in plant cells other than oilseed tissues. Any
transcription initiation region capable of directing
expression in a plant host which causes initiation of
adequate levels of transcription selectively in storage
tissues during seed development, for example, should be
sufficient. As such, seed specific promoters may be
desired. Other manners of decreasing the amount of
endogenous plant desaturase, such as ribozymes or the
screening of plant cells transformed with constructs for
rare events containing sense sequences which in fact act to
decrease desaturase expression, are also contemplated
herein. Other analogous methods may be applied by those of
ordinary skill in the art.
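
The relationship between a sense coding sequence and its anti-sense counterpart can be illustrated with a short sketch. Placing the reverse complement of a coding fragment downstream of a promoter yields a transcript complementary to the native message. The function names and the seven-base fragment below are hypothetical and chosen only for illustration; they are not taken from the sequences of this specification.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_transcript(sense_dna: str) -> str:
    """RNA (5' to 3') transcribed when the sense fragment is inserted in
    reverse orientation behind a promoter: the reverse complement of the
    sense strand, with U in place of T."""
    return reverse_complement(sense_dna).upper().replace("T", "U")


if __name__ == "__main__":
    sense = "ATGGCTC"  # hypothetical fragment, illustration only
    print(reverse_complement(sense))
    print(antisense_transcript(sense))
```

Read antiparallel, each base of the resulting transcript pairs with the corresponding base of the native message, which is the property the anti-sense approach relies on.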

By careful selection of plants, transformants having
particular oils profiles may be obtained. This may in part
depend upon the qualities of the transcription initiation
region(s) employed or may be a result of culling
transformation events to exploit the variabilities of
expression observed.

In order to obtain the nucleic acid sequences encoding
C. tinctorius desaturase, a protein preparation free of a
major albumin-type contaminant is required. As
demonstrated more fully in the Examples, the protocols of
McKeon and Stumpf, supra, result in a preparation
contaminated with a seed storage protein. Removal of the
protein contaminant may be effected by application of a
reverse-phase HPLC, or alternatively, by application of a
reduction and alkylation step followed by electrophoresis
and blotting, for example. Other purification methods may
be employed as well, now that the presence of the
contaminant is confirmed and various properties thereof
described. Once the purified desaturase is obtained it may
be used to obtain the corresponding amino acid and/or
nucleic acid sequences thereto in accordance with methods
familiar to those skilled in the art. Approximately 90% of
the total amino acid sequence of the C. tinctorius
desaturase is provided in Fig. 1 and in SEQ ID NOS: 1-11.

The desaturase produced in accordance with the subject
invention can be used in preparing antibodies for assays
for detecting plant desaturase from other sources.

A nucleic acid sequence of this invention may include
genomic or cDNA sequence and mRNA. A cDNA sequence may or
may not contain pre-processing sequences, such as transit
peptide sequences. Transit peptide sequences facilitate
the delivery of the protein to a given organelle and are
cleaved from the amino acid moiety upon entry into the
organelle, releasing the "mature" sequence.

In Fig. 2 and SEQ ID NO: 13, the sequence of the C.
tinctorius desaturase precursor protein is provided; both
the transit peptide and mature protein sequence are shown.
Also provided in this invention are cDNA sequences relating
to R. communis desaturase (Fig. 3 and SEQ ID NOS: 14-15), B.
campestris desaturase (Fig. 4 and SEQ ID NOS: 17-19) and S.
chinensis (Fig. 5 and SEQ ID NO: 43).

The use of the precursor cDNA sequence is preferred in
desaturase expression cassettes. In addition, desaturase
transit peptide sequences may be employed to translocate
other proteins of interest to plastid organelles for a
variety of uses, including the modulation of other enzymes
related to the FAS pathway. See, European Patent
Application Publication No. 189,707.

As described in more detail below, the complete
genomic sequence of a desaturase may be obtained by the
screening of a genomic library with a desaturase cDNA probe
and isolating those sequences which regulate expression in
seed tissue. In this manner, the transcription,
translation initiation regions and/or transcript
termination regions of the desaturase may be obtained for
use in a variety of DNA constructs, with or without the
respective desaturase structural gene.

Other nucleic acid sequences "homologous" or "related"
to DNA sequences encoding other desaturases are also
provided. "Homologous" or "related" includes those nucleic
acid sequences which are identical or conservatively
substituted as compared to the exemplified C. tinctorius,
R. communis, S. chinensis or B. campestris desaturase
sequences of this invention or a plant desaturase which has
in turn been obtained from a plant desaturase of this
invention. By conservatively substituted is meant that
codon substitutions encode the same amino acid, as a result
of the degeneracy of the DNA code, or that a different
amino acid having similar properties to the original amino
acid is substituted. One skilled in the art will readily
recognize that antibody preparations, nucleic acid probes
(DNA and RNA) and the like may be
prepared and used to screen and recover desaturase from
other plant sources. Typically, nucleic acid probes are
labeled to allow detection, preferably with radioactivity
although enzymes or other methods may also be used. For
immunological screening methods, antibody preparations
either monoclonal or polyclonal are utilized. Polyclonal
antibodies, although less specific, typically are more
useful in gene isolation. For detection, the antibody is
labeled using radioactivity or any one of a variety of
second antibody/enzyme conjugate systems that are
commercially available. Examples of some of the available
antibody detection systems are described by Oberfelder
(Focus (1989) BRL Life Technologies, Inc., 11:1-5).
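
The two senses of "conservatively substituted" described above (a codon change that encodes the same amino acid, or a change to an amino acid with similar properties) can be sketched as a small classifier. The codon assignments below are a partial excerpt of the standard genetic code. The similarity groups and the function name are assumptions made for illustration; the specification does not define which amino acids count as having similar properties.

```python
# Partial excerpt of the standard genetic code (DNA codons).
CODON_TO_AA = {
    "GCT": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "CTT": "Leu", "CTC": "Leu", "CTA": "Leu", "CTG": "Leu",
    "ATT": "Ile", "ATC": "Ile", "ATA": "Ile",
    "GTT": "Val", "GTC": "Val", "GTA": "Val", "GTG": "Val",
    "GAT": "Asp", "GAC": "Asp",
    "GAA": "Glu", "GAG": "Glu",
    "AAA": "Lys", "AAG": "Lys",
}

# Assumed similarity groups, for illustration only.
SIMILAR_GROUPS = [
    {"Leu", "Ile", "Val"},  # aliphatic hydrophobic side chains
    {"Asp", "Glu"},         # acidic side chains
]


def classify_substitution(original: str, substituted: str) -> str:
    """Classify a codon change as 'synonymous', 'conservative' or
    'non-conservative' using the tables above."""
    aa_a = CODON_TO_AA[original.upper()]
    aa_b = CODON_TO_AA[substituted.upper()]
    if aa_a == aa_b:
        return "synonymous"
    if any(aa_a in group and aa_b in group for group in SIMILAR_GROUPS):
        return "conservative"
    return "non-conservative"


if __name__ == "__main__":
    print(classify_substitution("GCT", "GCC"))  # Ala -> Ala
    print(classify_substitution("CTG", "ATC"))  # Leu -> Ile
    print(classify_substitution("GAT", "AAA"))  # Asp -> Lys
```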

A "homologous" or "related" nucleic acid sequence will 
show at least about 60% homology, and more preferably at
least about 70% homology, between the known desaturase
sequence and the desired candidate plant desaturase of
interest, excluding any deletions which may be present.
Homology is determined upon comparison of sequence
information, nucleic acid or amino acid, or through
hybridization reactions. Amino acid sequences are
considered homologous by as little as 25% sequence identity
between the two complete mature proteins. (See generally,
Doolittle, R.F., Of URFs and ORFs, University Science
Books, CA, 1986.)
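
One way to read the "excluding any deletions" criterion is to score identity only over aligned positions where neither sequence has a gap. The sketch below takes that reading as an assumption; the function name, the gap character and the example alignment are hypothetical, and the specification does not prescribe a particular alignment or scoring method.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over gap-free columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip deletions in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical alignment, illustration only: 8 gap-free columns, 7 matches.
    known = "ATGGC-TAAC"
    candidate = "ATGACGTA-C"
    score = percent_identity(known, candidate)
    print(score, score >= 60.0)
```

The same function works for aligned amino acid sequences, where the text above notes that identity as low as 25% between complete mature proteins may still indicate homology.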

Oligonucleotide probes can be considerably shorter
than the entire sequence, but should be at least about 10,
preferably at least about 15, more preferably at least 20
nucleotides in length. When shorter length regions are
used for comparison, a higher degree of sequence identity
is required than for longer sequences. Shorter probes are
often particularly useful for polymerase chain reactions
(PCR), especially when highly conserved sequences can be
identified. (See, Gould, et al., PNAS USA (1989) 85:1934-
1938.) Longer oligonucleotides are also useful, up to the
full length of the gene encoding the polypeptide of
interest. When longer nucleic acid fragments are employed
(>100 bp) as probes, especially when using complete or
large cDNA sequences, one would screen with low
stringencies (for example 40-50°C below the melting
temperature of the probe) in order to obtain signal from
the target sample with 20-50% deviation, i.e., homologous
sequences. (See, Beltz, et al., Methods in Enzymology
(1983) 100:266-285.) Both DNA and RNA probes can be used.
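
The low-stringency window described above (40-50°C below the probe melting temperature) can be sketched numerically. The melting temperature estimate below uses a commonly cited empirical approximation for long DNA duplexes in aqueous salt solution, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%GC) - 500/N. That formula, the default sodium concentration and all function names are assumptions introduced for illustration; the specification does not state how the melting temperature is to be determined.

```python
import math


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def estimate_tm_celsius(seq: str, sodium_molar: float = 0.3) -> float:
    """Approximate melting temperature of a long probe (assumed formula,
    intended for fragments of roughly 100 bp or more, no formamide)."""
    n = len(seq)
    return (81.5
            + 16.6 * math.log10(sodium_molar)
            + 0.41 * gc_percent(seq)
            - 500.0 / n)


def low_stringency_window(tm_celsius: float) -> tuple:
    """Hybridization temperature range 40-50 degrees C below the Tm."""
    return (tm_celsius - 50.0, tm_celsius - 40.0)


if __name__ == "__main__":
    # Hypothetical 200-nt probe with 50% GC content, illustration only.
    probe = "GC" * 50 + "AT" * 50
    tm = estimate_tm_celsius(probe)
    print(round(tm, 1), low_stringency_window(tm))
```

In practice the melting temperature would also depend on formamide concentration and on mismatches between probe and target, which this sketch leaves out.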
A genomic library prepared from the plant source of
interest may be probed with conserved sequences from a
known desaturase to identify homologously related
sequences. Use of the entire cDNA may be employed if
shorter probe sequences are not identified. Positive
clones are then analyzed by restriction enzyme digestion
and/or sequencing. When a genomic library is used, one or
more sequences may be identified providing both the coding
region, as well as the transcriptional regulatory elements
of the desaturase gene from such plant source. In this
general manner, one or more sequences may be identified
providing both the coding region, as well as the
transcriptional regulatory elements of the desaturase gene
from such plant source.

In use, probes are typically labeled in a detectable
manner (for example with 32P-labeled or biotinylated
nucleotides) and are incubated with single-stranded DNA or
RNA from the plant source in which the gene is sought,
although unlabeled oligonucleotides are also useful.
Hybridization is detected by means of the label after
single-stranded and double-stranded (hybridized) DNA or
DNA/RNA have been separated, typically using nitrocellulose
paper or nylon membranes. Hybridization techniques suitable
for use with oligonucleotides are well known to those
skilled in the art. Thus, plant desaturase genes may be
isolated by various techniques from any convenient plant.
Plant desaturase of developing seed obtained from other
oilseed plants, such as soybean, coconut, oilseed rape,
sunflower, oil palm, peanut, cocoa, cotton, corn and the
like are desired as well as from non-traditional oil
sources, including but not limited to spinach chloroplast,
avocado mesocarp, cuphea, California Bay, cucumber, carrot,
meadowfoam, Oenothera and Euglena gracilis.

Once the desired plant desaturase sequence is
obtained, it may be manipulated in a variety of ways.
Where the sequence involves non-coding flanking regions,
the flanking regions may be subjected to resection,
mutagenesis, etc. Thus, transitions, transversions,
deletions, and insertions may be performed on the naturally
occurring sequence. In addition, all or part of the
sequence may be synthesized, where one or more codons may
be modified to provide for a modified amino acid sequence,
or one or more codon mutations may be introduced to provide
for a convenient restriction site or other purpose involved
with construction or expression. The structural gene may
be further modified by employing synthetic adapters,
linkers to introduce one or more convenient restriction
sites, or the like.

Recombinant constructs containing a nucleic acid
sequence encoding a desaturase of this invention may be
combined with other, i.e. "heterologous," DNA sequences in
a variety of ways. By heterologous DNA sequences is meant
any DNA sequence which is not naturally found joined to the
native desaturase, including combinations of DNA sequences
from the same plant of the plant desaturase which are not
naturally found joined together. In a preferred
embodiment, the DNA sequence encoding a plant desaturase is
combined in a DNA construct having, in the 5' to 3'
direction of transcription, a transcription initiation
control region capable of promoting transcription in a host
cell, and a DNA sequence encoding a desaturase in either a
sense or anti-sense orientation. As described in more
detail elsewhere, a variety of regulatory control regions
containing transcriptional or transcriptional and
translational regions may be employed, including all or
part of the non-coding regions of the plant desaturase.

The open reading frame coding for the plant desaturase
or functional fragment thereof will be joined at its 5' end
to a transcription initiation regulatory control region.
In some instances, such as modulation of plant desaturase
via a desaturase in an anti-sense orientation, a
transcription initiation region or transcription/
translation initiation region may be used. In embodiments
wherein the expression of the desaturase protein is desired
in a plant host, a transcription/translation initiation
regulatory region is needed. Additionally, modified
promoters, i.e., having transcription initiation regions
derived from one gene source and translation initiation
regions derived from a different gene source or enhanced
promoters, such as double 35S CaMV promoters, may be
employed for some applications.

As described above, of particular interest are those
5' upstream non-coding regions which are obtained from
genes regulated during seed maturation, particularly those
preferentially expressed in plant embryo tissue, such as
ACP- and napin-derived transcription initiation control
regions. Such regulatory regions are active during lipid
accumulation and therefore offer potential for greater
control and/or effectiveness to modify the production of
plant desaturase and/or modification of the fatty acid
composition. Especially of interest are transcription
initiation regions which are preferentially expressed in
seed tissue, i.e., which are undetectable in other plant
parts. For this purpose, the transcript initiation region
of acyl carrier protein isolated from B. campestris seed
and designated as "Bcg 4-4" and an unidentified gene
isolated from B. campestris seed and designated as "Bce-4"
are also of substantial interest.

Briefly, Bce4 is found in immature embryo tissue at
least as early as 11 days after anthesis (flowering),
peaking about 6 to 8 days later or 17-19 days post-
anthesis, and becoming undetectable by 35 days post-
anthesis. The timing of expression of the Bce4 gene
closely follows that of lipid accumulation in seed tissue.
Bce4 is primarily detected in seed embryo tissue and to a
lesser extent found in the seed coat. Bce4 has not been
detected in other plant tissues tested: root, stem and
leaves.

Approximately 3.4 kb of genomic sequence of Bce4 is
provided in Fig. 8 and as SEQ ID NO: 21, including about 1
kb 5' to the structural gene, about 0.3 kb of the Bce4
coding gene sequence, and about 2.1 kb of the non-coding
regulatory 3' sequence. Bce4 transcript initiation regions
will contain at least 1 kb and more preferably about 5 to
about 7.5 kb of sequence immediately 5' to the Bce4
structural gene.

The Bcg 4-4 ACP message presents a similar expression
profile to that of Bce4 and, therefore, also corresponds to
lipid accumulation in the seed tissue. Bcg 4-4 is not
found in the seed coat and may show some differences in
expression level, as compared to Bce4, when the Bcg 4-4 5'
non-coding sequence is used to regulate transcription or
transcription and translation of a plant stearoyl-ACP
desaturase of this invention. Genomic sequence of Bcg 4-4
is provided in Fig. 9 and as SEQ ID NO: 28, including about
1.5 kb 5' to the structural gene, about 1.2 kb of the
Bcg 4-4 (ACP) structural gene sequence, and about 1.3 kb of
the non-coding regulatory 3' sequence.

The napin 1-2 message is found in early seed
development and thus, also offers regulatory regions which
can offer preferential transcriptional regulation of a
desired DNA sequence of interest such as the plant
desaturase DNA sequence of this invention during lipid
accumulation. Napins are one of the two classes of storage
proteins synthesized in developing Brassica embryos
(Bhatty, et al., Can J. Biochem. (1968) 45:1191-1197) and
have been used to direct tissue-specific expression when
reintroduced into the Brassica genome (Radke, et al.,
Theor. Appl. Genet. (1988) 75:685-694). Genomic sequence
of napin 1-2 is provided in Fig. 10 and as SEQ ID NO: 29,
including about 1.7 kb 5' to the structural gene and about
1.3 kb of the non-coding regulatory 3' sequence.

Regulatory transcript termination regions may be
provided in DNA constructs of this invention as well.
Transcript termination regions may be provided by the DNA
sequence encoding the plant desaturase or a convenient
transcription termination region derived from a different
gene source, especially the transcript termination region
which is naturally associated with the transcript
initiation region. The transcript termination region will
contain at least about 1 kb, preferably about 3 kb of
sequence 3' to the structural gene from which the
termination region is derived.

In developing the DNA construct, the various
components of the construct or fragments thereof will
normally be inserted into a convenient cloning vector which
is capable of replication in a bacterial host, e.g., E.
coli. Numerous vectors exist that have been described in
the literature. After each cloning, the plasmid may be
isolated and subjected to further manipulation, such as
restriction, insertion of new fragments, ligation,
deletion, insertion, resection, etc., so as to tailor the
components of the desired sequence. Once the construct has
been completed, it may then be transferred to an
appropriate vector for further manipulation in accordance
with the manner of transformation of the host cell.

Normally, included with the DNA construct will be a
structural gene having the necessary regulatory regions for
expression in a host and providing for selection of
transformant cells. The gene may provide for resistance to
a cytotoxic agent, e.g. antibiotic, heavy metal, toxin,
etc., complementation providing prototrophy to an
auxotrophic host, viral immunity or the like. Depending
upon the number of different host species into which the
expression construct or components thereof are introduced,
one or more markers may be employed, where different
conditions for selection are used for the different hosts.

The manner in which the DNA construct is introduced
into the plant host is not critical to this invention. Any
method which provides for efficient transformation may be
employed. Various methods for plant cell transformation
include the use of Ti- or Ri-plasmids, microinjection,
electroporation, liposome fusion, DNA bombardment or the
like. In many instances, it will be desirable to have the
construct bordered on one or both sides by T-DNA,
particularly having the left and right borders, more
particularly the right border. This is particularly useful
when the construct uses A. tumefaciens or A. rhizogenes as
a mode for transformation, although the T-DNA borders may
find use with other modes of transformation.

Where Agrobacterium is used for plant cell
transformation, a vector may be used which may be
introduced into the Agrobacterium host for homologous
recombination with T-DNA or the Ti- or Ri-plasmid present
in the Agrobacterium host. The Ti- or Ri-plasmid
containing the T-DNA for recombination may be armed
(capable of causing gall formation) or disarmed (incapable
of causing gall formation), the latter being permissible,
so long as the vir genes are present in the transformed
Agrobacterium host. The armed plasmid can give a mixture
of normal plant cell and gall.

A preferred method for the use of Agrobacterium as the
vehicle for transformation of plant cells employs a vector
having a broad host range replication system, at least one
T-DNA boundary and the DNA sequence or sequences of
interest. Commonly used vectors include pRK2 or
derivatives thereof. See, for example, Ditta et al., PNAS
USA, (1980) 77:7347-7351 and EPA 0 120 515, which are
incorporated herein by reference. Normally, the vector
will be free of genes coding for opines, oncogenes and vir-
genes. Included with the expression construct and the T-
DNA will be one or more markers, which allow for selection
of transformed Agrobacterium and transformed plant cells.
A number of markers have been developed for use with plant
cells, such as resistance to chloramphenicol, the
aminoglycoside G418, hygromycin, or the like. The
particular marker employed is not essential to this
invention, one or another marker being preferred depending
on the particular host and the manner of construction. 
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The expression constructs may be employed with a wide 
variety of plant life, particularly plant life involved in 
the production of vegetable oils. These plants include, 
but are hot limited to rapeseed, sunflower, C- tlnctorlusr 
5 cotton, Cuphea^ peanut, soybean, oil palm and corn. Anti- 
sense constructs may be employed in such plants which share 
complementarity between the endogenous secpience and the 
anti-sense desaturase. Of special interest is the use of 
an anti-sense construct having a B. campestris desaturase 
10 in rapeseed, including B. campestris and B. napus. 

For transformation of plant cells using Agrobacterium,
explants may be combined and incubated with the transformed
Agrobacterium for sufficient time for transformation, the
bacteria killed, and the plant cells cultured in an
appropriate selective medium. Once callus forms, shoot
formation can be encouraged by employing the appropriate
plant hormones in accordance with known methods and the
shoots transferred to rooting medium for regeneration of
plants. The plants may then be grown to seed and the seed
used to establish repetitive generations and for isolation
of vegetable oil compositions. A variety of stable
genetic lines having fixed levels of saturation may be
obtained and integrated into a traditional breeding
program. Hemizygous and heterozygous lines or homozygous
lines may demonstrate different useful properties for oil
production and/or breeding. For example, saturation levels
may be increased up to 2-fold by the development of
homozygous plants as compared with heterozygous (including
hemizygous) plants.

The invention now being generally described, it will 
be more readily understood by reference to the following
examples which are included for purposes of illustration 
only and are not intended to limit the present invention. 

EXAMPLES

MATERIALS

Commercially available biological chemicals and
chromatographic materials, including BSA, catalase (twice
crystallized from bovine liver), spinach ferredoxin,
ferredoxin-NADP+ oxidoreductase (spinach leaf), NADPH,
unlabeled fatty acids, DEAE-cellulose (Whatman DE-52),
CNBr-activated Sepharose 4B, octyl-Sepharose, and Reactive
Blue Agarose are from Sigma (St. Louis, MO).
Triethylamine, trichloroacetic acid, guanidine-HCl, and
hydrazine-hydrate are also from Sigma. Proteolytic
enzymes, including endoproteinases lysC, gluC, and aspN are
from Boehringer Mannheim (Indianapolis, IN). Organic
solvents, including acetone, acetonitrile, methanol, ether
and petroleum ether are purchased from J.T. Baker
(Phillipsburg, NJ); concentrated acids and sodium sulfate
are also from J.T. Baker (Phillipsburg, NJ). HPLC grade
acetonitrile and trifluoroacetic acid (TFA) are obtained
from Burdick and Jackson (Muskegon, MI), and from Applied
Biosystems (Foster City, CA), respectively. Radiochemicals,
including [9,10(n)-3H]oleic acid (10mCi/µmol) and
[3H]-iodoacetic acid (64Ci/mol) are from New England Nuclear
(Boston, MA). Phenacyl-8 Reagent (bromoacetophenone with a
crown ether catalyst) used to prepare phenacyl esters of the
fatty acids for analysis is from Pierce (Rockford, IL). C18
reversed-phase thin-layer chromatography plates are from
Whatman (Clifton, NJ).

Acyl carrier protein (ACP) and acyl-ACP synthase are
isolated from E. coli strain K-12 as described by Rock and
Cronan (Rock and Cronan, Methods in Enzymol. (1981)
71:341-351 and Rock et al., Methods in Enzymol. (1981)
72:397-403). The E. coli is obtainable from Grain Processing
(Iowa) as frozen late-logarithmic phase cells.

[9,10(n)-3H]stearic acid is synthesized by reduction
of [9,10(n)-3H]oleic acid with hydrazine hydrate
essentially as described by Johnson and Gurr (Lipids (1971)
6:78-84). [9,10(n)-3H]oleic acid (2 mCi), supplemented
with 5.58mg unlabeled oleic acid to give a final specific
radioactivity of 100mCi/mmol, is dissolved in 2ml of
acetonitrile, acidified with 40µl of glacial acetic acid,
and heated to 55°C. Reduction is initiated with 100µl of
60% (w/w) hydrazine hydrate; oxygen is bubbled through the
mixture continuously. After each hour acetonitrile is
added to bring the volume back to 2ml and an additional
100µl of hydrazine hydrate is added. At the end of 5 hr.
the reaction is stopped by addition of 3ml of 2M HCl. The
reaction products are extracted with three 3ml aliquots of
petroleum ether and the combined ether extracts are washed
with water, dried over sodium sulfate and evaporated to
dryness. The dried reaction products are redissolved in
1.0ml acetonitrile and stored at -20°C. The distribution
of fatty acid products in a 15µl aliquot is determined by
preparation of phenacyl esters, which are then analyzed by
thin layer chromatography on C-18 reverse phase plates
developed with methanol:water, 95:5 (v/v). Usually
reduction to [9,10(n)-3H]stearic acid is greater than 90%;
a small amount of unreacted oleic acid may remain. The
analysis is used to establish the fraction of the total
radioactivity that is present as stearate, and thereby to
determine the exact substrate concentration in the enzyme
assay.
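
The isotope dilution described above can be checked with a short calculation. The sketch below is illustrative only: the molecular weight of oleic acid (282.46 g/mol) is an assumed value not stated in the text, the supplier's specific activity is read as 10 mCi/µmol, and the function name is hypothetical.

```python
# Assumed molecular weight of oleic acid in g/mol (not given in the text).
OLEIC_ACID_MW = 282.46


def diluted_specific_activity(label_mci, label_mci_per_umol, carrier_mg,
                              mw=OLEIC_ACID_MW):
    """Return the specific radioactivity (mCi/mmol) after adding carrier.

    label_mci          -- total radioactivity of the labeled acid (mCi)
    label_mci_per_umol -- specific activity of the labeled stock (mCi/umol)
    carrier_mg         -- mass of unlabeled acid added (mg)
    """
    label_umol = label_mci / label_mci_per_umol      # labeled acid, umol
    carrier_umol = carrier_mg / mw * 1000.0          # unlabeled acid, umol
    total_mmol = (label_umol + carrier_umol) / 1000.0
    return label_mci / total_mmol


# 2 mCi of label plus 5.58 mg of unlabeled oleic acid, as in the text.
specific_activity = diluted_specific_activity(2.0, 10.0, 5.58)
print(f"{specific_activity:.1f} mCi/mmol")
```

Under these assumptions the result falls within about 1% of the stated 100mCi/mmol, which suggests the quantities in the text are mutually consistent.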

Acyl-ACP substrates, including [9,10(n)-3H]stearoyl-ACP,
are prepared and purified by the enzymatic synthesis
procedure of Rock, Garwin, and Cronan (Methods in Enzymol.
(1981) 72:397-403).

Acyl carrier protein was covalently bound to Sepharose
4B by reaction of highly purified ACP with CNBr-activated
Sepharose 4B as described by McKeon and Stumpf (J. Biol.
Chem. (1982) 257:12141-12147).

Example 1

In this example, an initial purification of C.
tinctorius (safflower) desaturase, following the method of
McKeon and Stumpf (J. Biol. Chem. (1982) 257:12141-12147),
is described.

Assay: In each of the following steps, the presence
of the enzyme is detected radiometrically by measuring
enzyme-catalyzed release of tritium from
[9,10(n)-3H]stearoyl-ACP. Preparation of this substrate is
described in "Materials" above.

The assay is performed by mixing 150µl water, 5µl
dithiothreitol (100mM, freshly prepared in water), 10µl
bovine serum albumin (10mg/ml in water), 15µl NADPH (25mM,
freshly prepared in 0.1M Tricine-HCl, pH 8.2), 25µl spinach
ferredoxin (2mg/ml Sigma Type III in water), 3µl
NADPH:ferredoxin oxidoreductase (2.5 units/ml from Sigma),
and 1µl bovine liver catalase (800,000 units/ml from
Sigma); after 10 min at room temperature, this mixture is
added to a 13x100 mm screw-cap test tube containing 250µl
sodium 1,4-piperazinediethanesulfonate (0.1M, pH 6.0).
Finally, 10µl of the sample to be assayed is added and the
reaction is started by adding 30µl of the substrate,
[9,10(n)-3H]stearoyl-ACP (100µCi/µmol, 10µM in 0.1M sodium
1,4-piperazinediethanesulfonate, pH 5.8). After sealing
with a cap, the reaction is allowed to proceed for 10 min.
while shaking at 23°C. The reaction is terminated by
addition of 1.2ml of 5.8% trichloroacetic acid and the
resulting precipitated acyl-ACP's are removed by
centrifugation. The tritium released into the aqueous
supernatant by the desaturase reaction is measured by
liquid scintillation spectrometry. One unit of activity is
defined as the amount of enzyme required to convert 1µmol
of stearoyl-ACP to oleoyl-ACP, or to release 4µg-atoms of
3H per minute.
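
For orientation, the composition of the assay mixture can be tabulated and the final concentrations of selected components estimated. The following is a minimal sketch, assuming all volumes are in microliters (the dithiothreitol volume is read as 5 µl) and that volumes are additive; the dictionary and function names are illustrative and not part of the described method.

```python
# Component volumes in microliters, as read from the assay description.
ASSAY_VOLUMES_UL = {
    "water": 150,
    "dithiothreitol": 5,        # assumed to be 5 ul
    "bovine serum albumin": 10,
    "NADPH": 15,
    "ferredoxin": 25,
    "oxidoreductase": 3,
    "catalase": 1,
    "PIPES buffer": 250,
    "sample": 10,
    "substrate": 30,
}

# Stock concentrations in mM for components whose molarity is given.
STOCK_MM = {
    "dithiothreitol": 100.0,
    "NADPH": 25.0,
    "substrate": 0.010,         # 10 uM stearoyl-ACP
}

total_volume_ul = sum(ASSAY_VOLUMES_UL.values())


def final_concentration_mm(component):
    """Final concentration (mM) of a component in the complete assay."""
    return STOCK_MM[component] * ASSAY_VOLUMES_UL[component] / total_volume_ul


# mM x ul = nmol; multiply by 1000 to express the substrate in pmol.
substrate_pmol = STOCK_MM["substrate"] * ASSAY_VOLUMES_UL["substrate"] * 1000

print(f"total volume: {total_volume_ul} ul")
for name in STOCK_MM:
    print(f"{name}: {final_concentration_mm(name) * 1000:.2f} uM final")
print(f"substrate per assay: {substrate_pmol:.0f} pmol")
```

On this reading the complete reaction is roughly 0.5ml and contains about 300 pmol of substrate, so the amount of product formed in a 10 min incubation is bounded accordingly.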

Source tissue: Developing C. tinctorius seeds from
greenhouse grown plants are harvested between 16 and 18
days after flowering, frozen in liquid nitrogen and stored
at -70°C until extracted.

Acetone Powder: Approximately 50g of frozen seeds
are ground in liquid nitrogen and sieved to remove large
seed coat pieces to provide a fine embryo powder. The
powder is washed with acetone on a Buchner funnel until all
yellow color is absent from the filtrate. The powder is
then air dried and further processed as described below, or
may be stored frozen for at least a year at -70°C without
loss of enzyme activity.

Acetone Powder Extract: The dried acetone powder is
weighed and triturated with ten times its weight of 20mM
potassium phosphate, pH 6.8; the mixture is then
centrifuged at 12,000 x g for 20 minutes and decanted
through a layer of Miracloth (Calbiochem, La Jolla, CA).

Ion Exchange Chromatography: The acetone powder
extract is then applied to a DEAE-cellulose column (Whatman
DE-52) (1.5 x 12 cm) equilibrated with 20mM potassium
phosphate, pH 6.8. The pass-through and a wash with one
column-volume (20ml) of buffer are pooled.
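
The stated column volume can be compared with the geometric bed volume. The sketch below assumes that the column dimensions are given as internal diameter x bed height and that the bed is a simple cylinder; the function name is illustrative.

```python
import math


def bed_volume_ml(diameter_cm, height_cm):
    """Volume of a cylindrical column bed in ml (1 cm^3 = 1 ml)."""
    radius_cm = diameter_cm / 2.0
    return math.pi * radius_cm ** 2 * height_cm


# DEAE-cellulose column, read as 1.5 cm diameter x 12 cm bed height.
de52_volume_ml = bed_volume_ml(1.5, 12.0)
print(f"DE-52 column: {de52_volume_ml:.1f} ml")
```

Under these assumptions the bed volume is about 21ml, in reasonable agreement with the nominal 20ml column-volume used for the wash.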

Affinity Chromatography: An affinity matrix for
purification of the desaturase is prepared by reacting
highly purified E. coli ACP with CNBr-activated Sepharose
4B (Sigma). ACP (120mg) is reduced by treatment with 1mM
dithiothreitol for 30 min on ice, and then desalted on
Sephadex G-10 (Pharmacia) equilibrated with 0.1M sodium
bicarbonate, pH 6.0. The treated ACP (20 ml, 6 mg/ml) is
then mixed with 20ml of CNBr-activated Sepharose 4B swollen
in 0.1M sodium bicarbonate, pH 7.0, and the mixture is
allowed to stand at 4°C for one day. The gel suspension is
then centrifuged, washed once with 0.1M sodium bicarbonate,
pH 7.0, and then treated with 40ml 0.1M glycine, pH 8.0,
for 4 hours at room temperature to block unreacted sites.
The gel is then washed for five cycles with alternating
50ml volumes of 0.5M NaCl in 0.1M sodium acetate, pH 4.0,
and 0.5M NaCl in 0.1M sodium bicarbonate, pH 6.5, to remove
non-covalently bound ligand. The gel is loaded into a
column (1.5 x 11.2 cm) and equilibrated in 20mM potassium
phosphate, pH 6.8.

The combined fractions from the DE-52 column are
applied to the column, which is subsequently washed with
one column volume (20ml) of the equilibration buffer, and
then with 2.5 column volumes (50ml) of 300mM potassium
phosphate, pH 6.8. Fractions are assayed for protein using
the BCA Protein Assay Reagent (Pierce, Rockford, IL) to
make sure that all extraneous protein has been eluted.
Active Δ-9 desaturase is eluted from the column with 600mM
potassium phosphate, pH 6.8. Active fractions are analyzed
by polyacrylamide gel electrophoresis in sodium dodecyl
sulfate (SDS-PAGE) on 0.75mm thick 8 x 12 cm mini-gels
according to the method of Laemmli (Nature (1970) 227:680).
The running gel contains 10% acrylamide in a 30/0.8 ratio
of acrylamide to cross-linker bis-acrylamide. Those
fractions containing a predominant band at approximately 43
kD are pooled and stored frozen at -70°C until final
purification. The yield from 50g of seed tissue is
approximately 60µg of protein as measured by amino acid
analysis.

Further purification as described in Example 2 or 
Example 3 is then applied to the fractions pooled from the 
ACP-Sepharose column separation. 

Example 2

In this example, a protocol for the final purification 
of C. tinctorius desaturase is described. Seeds are 
treated in accordance with Example 1. 

Reverse-Phase HPLC: Fractions from the ACP-Sepharose
column are pooled and applied to a Vydac C4 reverse-phase
column (0.45 x 15 cm) equilibrated in 0.1% TFA, 7%
acetonitrile. After a 10 min wash with 0.1% TFA, the
column is eluted with a gradient of increasing acetonitrile
(7%-70% v/v) in 0.1% TFA over a period of 45 min. The flow
rate is 0.5ml/min throughout. Eluting components are
monitored by absorbance at 214 nm. Δ-9 desaturase elutes
at about 42 min. (approximately 50% acetonitrile); the
major contaminant protein remaining from ACP-affinity
chromatography elutes at about 28 min. (approximately 30%
acetonitrile). The substantially homogeneous desaturase,
which is no longer active, is identified by SDS-PAGE, in
which it exhibits a single band corresponding to a
molecular weight of approximately 43 kD. The quantity of
desaturase protein in the sample may be determined by amino
acid analysis.
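
The relationship between elution time and acetonitrile concentration in the reverse-phase step can be sketched as follows. This is an illustrative calculation only, assuming a linear gradient, elution times measured from the start of the 10 min wash, and no correction for system dead volume; the function name and defaults are hypothetical.

```python
def percent_acetonitrile(t_min, wash_min=10.0, gradient_min=45.0,
                         start_pct=7.0, end_pct=70.0):
    """Nominal acetonitrile percentage at time t_min for a linear gradient."""
    if t_min <= wash_min:
        return start_pct                       # isocratic wash
    if t_min >= wash_min + gradient_min:
        return end_pct                         # gradient complete
    fraction = (t_min - wash_min) / gradient_min
    return start_pct + (end_pct - start_pct) * fraction


for label, minutes in (("contaminant", 28.0), ("desaturase", 42.0)):
    print(f"{label}: {percent_acetonitrile(minutes):.1f}% acetonitrile")
```

With these assumptions the nominal values are about 32% at 28 min and about 52% at 42 min, in line with the approximate 30% and 50% figures given above.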

Example 3

In this example, a protocol for the final purification
of C. tinctorius desaturase is described. Seeds are
treated in accordance with Example 1.

Reduction and Alkylation: Protein is precipitated out
of the pooled fraction solutions recovered from the
ACP-Sepharose column with 10% (w/v) trichloroacetic acid,
washed with cold (-20°C) acetone, and resuspended in 1 ml
500mM Tris-HCl, pH 8.6, containing 6M guanidine-HCl, 10mM
EDTA, and 3.2 mM dithiothreitol. After 2 hours, 3.52 µmol
[3H]-iodoacetic acid (64µCi/µmol, New England Nuclear) is
added, and the reaction is allowed to proceed at room
temperature in the dark for 2 hours, at which time the
reaction is terminated by addition of 1µl (15µmol)
β-mercaptoethanol. The sample is then re-precipitated with
10% (w/v) trichloroacetic acid, and the pellet again washed
with cold (-20°C) acetone and resuspended in Laemmli's
SDS-sample buffer (Nature (1970) 227:680).

SDS-Polyacrylamide Gel Electrophoresis: The resulting
sample is boiled for 5 min. and then applied to a 1.5 mm
thick, 8 x 12 cm, SDS-polyacrylamide mini-gel prepared as
described by Laemmli, supra. The running gel contains
17.5% acrylamide in a 30:0.13 ratio of acrylamide to
cross-linking bis-acrylamide. Separation is achieved by
electrophoresis at 15 mA, for 2 hours at 4°C.

Blotting from SDS-gels to PVDF Membrane: Proteins are
recovered from the gel by electroblotting at 5 mA/cm2 to a
four-layer sandwich of polyvinylidenedifluoride (PVDF)
membrane for 2 h at 4°C in a buffer containing 10mM CAPS
("3-[cyclohexylamino]-1-propane-sulfonic acid"), pH 11.
The membranes must be wetted in 50% methanol, prior to
exposure to the blotting buffer. After blotting, the
membrane layers are stained for 1-2 min. in 0.02% Coomassie
Blue in 50% methanol, and then destained in 50% methanol.
The desaturase is identified as a band corresponding to a
molecular weight of about 43 kD; the major contaminant runs
at or near the dye front of the gel corresponding to a
molecular weight less than 20 kD.

The desaturase band on the PVDF membrane may be
applied directly to the Edman sequencer (Applied Biosystems
Model 477A) for determination of the N-terminal sequence of
the intact protein, or for more extensive sequence
determination, may be eluted from the membrane in 40%
acetonitrile to recover pure desaturase in solution.
Acetonitrile is removed from the eluted desaturase by
evaporation on a Speed-Vac (Savant; Farmingdale, NY), and
the substantially homogeneous Δ-9 desaturase is resuspended
in an appropriate buffer for subsequent proteolytic
digestion as described in Example 4. The quantity of
desaturase protein present may be determined by amino acid
analysis.

Alternatively, if the sample is to be digested with
trypsin or gluC protease to generate peptides for amino
acid sequence analysis, proteins may be electroblotted to
nitrocellulose membranes and stained with Ponceau or amido
black.

Example 4

In this example, a method for the determination of the 
amino acid sequence of a desaturase is described. 

Reduction and Alkylation: Substantially homogeneous
stearoyl-ACP desaturase (See, Example 2) is reduced and
alkylated with [3H]-iodoacetic acid (See, Example 3), except
that the final acetone-washed pellet is resuspended in the
appropriate buffer for subsequent proteolysis. Reduction
and alkylation assures complete denaturation of the protein
so that complete proteolysis can occur. The sample may be
alkylated with radiolabeled iodoacetamide or with
4-vinylpyridine instead of [3H]-iodoacetic acid in
substantially the same manner. Use of iodoacetic acid
affords an alkylated sample with greater solubility, which
is advantageous in subsequent sample manipulation.

Proteolysis: Substantially pure alkylated samples
are digested with endoproteinase lysC. The sample is
resuspended in 100 µl of 25 mM Tris-HCl, pH 8.8, containing
1 mM EDTA. Endoproteinase lysC is added to the sample in a
protease/desaturase ratio of 1/50 (w/w). Digestion is
allowed to proceed at room temperature for 8 hours, at
which time another equal amount of protease is added.
After 18 more hours, 1 µl of concentrated HCl is added to
stop proteolysis, and the sample is applied directly to a
Vydac C18 reverse-phase column (0.2 x 15 cm) equilibrated
in 7% acetonitrile (v/v) in 0.1 mM sodium phosphate, pH
2.2. After washing for 20 min with the equilibration
buffer, peptides are eluted with a gradient in acetonitrile
(7-70%, v/v) over 120 min. Flow rate is 50 µl/min,
throughout. Eluting components are monitored by absorbance
at 214 nm, and individual peptide peaks are collected as
separate fractions. The peptide fractions are further
purified by application to a second Vydac C18 reverse-phase
column (0.2 x 15 cm) equilibrated in 7% (v/v) acetonitrile
in 0.1% (v/v) trifluoroacetic acid. Again, after a 20 min
wash with equilibration buffer, the substantially pure
peptides are eluted with a gradient (7-70%, v/v) of
acetonitrile in 0.1% trifluoroacetic acid over 120 min.
The flow rate is 50 µl/min, throughout. Eluting components
are monitored by absorbance at 214 nm, and individual
peptide peaks are collected as separate fractions. These
substantially pure peptides are applied directly to the
Edman sequencer (Applied Biosystems, Model 477A) for amino
acid sequence analysis. Alternatively, peptide fractions
from the first HPLC purification in phosphate buffer, or
from a single chromatography step in trifluoroacetic acid
buffer, may be applied directly to the sequencer, but these
fractions, in many cases, are not substantially pure and
yield mixed or ambiguous sequence information.

Other proteases may be used to digest desaturase,
including but not limited to trypsin, gluC, and aspN.
While the individual digest buffer conditions may be
different, the protocols for digestion, peptide separation,
purification, and sequencing are substantially the same as
those outlined for the digestion with lysC. Alternatively,
desaturase may be digested chemically using cyanogen
bromide (Gross, Methods Enzymol. (1967) 11:238-255 or Gross
and Witkop, J. Am. Chem. Soc. (1961) 83:1510), hydroxylamine
(Bornstein and Balian, Methods Enzymol. (1977) 47:132-145),
iodosobenzoic acid (Inglis, Methods Enzymol. (1983)
91:324-332), or mild acid (Fontana et al., Methods Enzymol.
(1983) 91:311-317), as described in the respective
references.

Fragments generated from these digestion steps of C.
tinctorius desaturase are presented in Fig. 1 and as SEQ ID
NOS: 1-11.

Example 5

In this example, the preparation of a plant embryo
cDNA bank, using the methods as described in Alexander, et
al. (Methods in Enzymology (1987) 154:41-64) and the
screening of the bank to obtain a desaturase cDNA clone is
described.

C. tinctorius: A plant embryo cDNA library may be
constructed from poly(A)+ RNA isolated from C. tinctorius
embryos collected at 14-17 days post-anthesis. Poly(A)+
RNA is isolated from polyribosomes by a method initially
described by Jackson and Larkins (Plant Physiol. (1976)
57:5-10) as modified by Goldberg et al. (Developmental
Biol. (1981) 83:201-217).

The plasmid cloning vector pCGN1703, derived from the
commercial cloning vector Bluescribe M13- (Stratagene
Cloning Systems; San Diego, CA), is made as follows. The
polylinker of Bluescribe M13- is altered by digestion with
BamHI, treatment with mung bean endonuclease, and blunt-end
ligation to create a BamHI-deleted plasmid, pCGN1700.
pCGN1700 is digested with EcoRI and SstI (adjacent
restriction sites) and annealed with synthetic
complementary oligonucleotides having the sequences
5' CGGATCCACTGCAGTCTAGAGGGCCCGGGA 3' (SEQ ID NO: 30) and
5' AATTTCCCGGGCCCTCTAGACTGCAGTGGATCCGAGCT 3' (SEQ ID NO:
31). These sequences are inserted to eliminate the EcoRI
site, move the BamHI site onto the opposite side of the
SstI (also, sometimes referred to as "SacI" herein) site
found in Bluescribe, and to include new restriction sites
PstI, XbaI, ApaI, SmaI. The resulting plasmid, pCGN1702, is
digested with HindIII and blunt-ended with Klenow enzyme;
the linear DNA is partially digested with PvuII and ligated
with T4 DNA ligase in dilute solution. A transformant
having the lac promoter region deleted is selected
(pCGN1703) and is used as the plasmid cloning vector.
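
The design of the two synthetic oligonucleotides can be examined computationally. The sketch below takes the two sequences from the text (SEQ ID NOS: 30 and 31), checks that the shorter oligonucleotide is the reverse complement of an internal stretch of the longer one, and reports the single-stranded ends left after annealing together with the restriction sites present. The recognition sequences used are the standard ones for these enzymes; the variable and function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


OLIGO_30 = "CGGATCCACTGCAGTCTAGAGGGCCCGGGA"          # SEQ ID NO: 30
OLIGO_31 = "AATTTCCCGGGCCCTCTAGACTGCAGTGGATCCGAGCT"  # SEQ ID NO: 31

# Standard recognition sequences for the enzymes named in the text.
SITES = {
    "BamHI": "GGATCC",
    "PstI": "CTGCAG",
    "XbaI": "TCTAGA",
    "ApaI": "GGGCCC",
    "SmaI": "CCCGGG",
}

# Locate the region of the longer oligo that pairs with the shorter one.
duplex_core = reverse_complement(OLIGO_30)
core_start = OLIGO_31.find(duplex_core)

# Bases of the longer oligo left unpaired at each end after annealing.
five_prime_overhang = OLIGO_31[:core_start]
three_prime_overhang = OLIGO_31[core_start + len(duplex_core):]

sites_found = [name for name, site in SITES.items() if site in OLIGO_30]

print("5' overhang:", five_prime_overhang)
print("3' overhang:", three_prime_overhang)
print("sites in duplex:", ", ".join(sites_found))
```

The annealed pair carries an AATT end compatible with the EcoRI-cut vector and an AGCT end compatible with the SstI-cut vector. Because the base following AATT is T rather than C, ligation would not be expected to regenerate the GAATTC recognition sequence, which is consistent with the stated elimination of the EcoRI site.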

Briefly, the cloning method for cDNA synthesis is as
follows. The plasmid cloning vector is digested with SstI
and homopolymer T-tails are generated on the resulting
3'-overhang sticky-ends using terminal deoxynucleotidyl
transferase. The tailed plasmid is separated from
undigested or un-tailed plasmid by oligo(dA)-cellulose
chromatography. The resultant vector serves as the primer
for synthesis of cDNA first strands covalently attached to
either end of the vector plasmid. The cDNA-mRNA-vector
complexes are treated with terminal transferase in the
presence of deoxyguanosine triphosphate, generating G-tails
at the ends of the cDNA strands. The extra cDNA-mRNA
complex, adjacent to the BamHI site, is removed by BamHI
digestion, leaving a cDNA-mRNA-vector complex with a BamHI
sticky-end at one end and a G-tail at the other. This
complex is cyclized using the annealed synthetic cyclizing
linker,
5'-GATCCGCGGCCGCGAATTCGAGCTCCCCCCCCCC-3' and
    3'-GCGCCGGCGCTTAAGCTCGA-5'
which has a BamHI sticky-end and a C-tail end. Following
ligation and repair the circular complexes are transformed
into E. coli strain DH5α (BRL; Gaithersburg, MD) to generate
the cDNA library. The C. tinctorius embryo cDNA bank
contains between 3x10^ and 5x10^ clones with an average
cDNA insert size of approximately 1000 base pairs.

Probe production Including PCR Reactions: Two regions
of amino acid sequence (Example 4) with low codon
degeneracy are chosen from opposite ends of peptide
sequence "Fragment F2" (SEQ ID NO: 2) for production of a
probe for the plant desaturase cDNA. Two sets of mixed
oligonucleotides are designed and synthesized for use as
forward (SEQ ID NOS: 21-24) and reverse (SEQ ID NOS: 25-26)
primers, respectively, in the polymerase chain reaction
(Saiki et al., Science (1985) 230:1350-1354; Oste,
Biotechniques (1988) 6:162-167). See, Fig. 6. All
oligonucleotides are synthesized on an Applied Biosystems
380A DNA synthesizer.

Probes to C. tinctorius desaturase may be prepared
using the peptide sequence "Fragment 2" (SEQ ID NO: 2)
identified in Fig. 1. Four types of forward primers were
synthesized and labeled 13-1, 13-2, 13-3, and 13-4 (SEQ ID
NOS: 21-24, respectively). Two groups of reverse primers
were synthesized and designated 13-5A and 13-6A (SEQ ID
NOS: 25-26, respectively). The primer sequences are shown
in Fig. 6. These oligonucleotide groups have a redundancy
of 64 or less and contain either 20 or 17 bases of coding
sequence along with flanking restriction site sequences for
HindIII or EcoRI. Based on the intervening amino acid
sequence between the primer regions on peptide "Fragment 2"
(SEQ ID NO: 2) the PCR product is expected to contain 107
base pairs.
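
The redundancy of a mixed oligonucleotide is the number of distinct coding sequences that could encode the chosen peptide stretch. A minimal sketch of that calculation, using the standard genetic code, is given below. The example peptides are hypothetical and are not taken from Fragment F2, whose sequence is not reproduced in this section; the names used are illustrative.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons for each amino acid.
CODONS_PER_AMINO_ACID = Counter(CODON_TABLE.values())


def redundancy(peptide):
    """Number of distinct DNA sequences that encode the peptide."""
    count = 1
    for residue in peptide:
        count *= CODONS_PER_AMINO_ACID[residue]
    return count


# Hypothetical peptides chosen only to illustrate the calculation.
print(redundancy("MWDEKF"))  # Met and Trp have one codon, the others two
print(redundancy("LSR"))     # Leu, Ser and Arg have six codons each
```

A stretch rich in single-codon and two-codon residues keeps the redundancy low, which is the reason regions with low codon degeneracy are preferred for primer design.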

Polymerase chain reaction is performed using the cDNA
library DNA as template and the possible eight combinations
of the four forward and two reverse oligonucleotides as
primers in a Perkin-Elmer/Cetus DNA Thermal Cycler
(Norwalk, CT) thermocycle file 1 min. 94°C, 2 min. 42°C, 2
min rise from 42°-72°C for 30 cycles, followed by the step
cycle file without step rises, 1 min. 94°C, 2 min. 42°C, 3
min. 72°C with increasing 15 sec extensions of the 72°C
step for 10 cycles, and a final 10 min. 72°C extension.
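
The eight primer combinations referred to above follow from pairing each forward primer with each reverse primer. A brief sketch, using the primer designations given in the text (the variable names are illustrative):

```python
from itertools import product

FORWARD_PRIMERS = ["13-1", "13-2", "13-3", "13-4"]
REVERSE_PRIMERS = ["13-5A", "13-6A"]

# Every forward primer paired with every reverse primer.
PRIMER_PAIRS = list(product(FORWARD_PRIMERS, REVERSE_PRIMERS))

for forward, reverse in PRIMER_PAIRS:
    print(f"{forward} + {reverse}")
```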

The product of the 13-4 forward primer (SEQ ID NO: 24)
and the 13-5A reverse primer (SEQ ID NO: 25) reaction was
ethanol precipitated and then digested with HindIII and
EcoRI; the resulting fragment was subcloned into pUC8
(Vieira and Messing, Gene (1982) 19:259-268).
Minipreparation DNA (Maniatis et al., Molecular Cloning: A
Laboratory Manual (1982) Cold Spring Harbor Laboratory, New
York) of one clone was sequenced by Sanger dideoxy
sequencing (Sanger et al., Proc. Nat. Acad. Sci. USA (1977)
74:5463-5467) using the M13 universal and reverse primers.
Translation of the resulting DNA sequence results in a
peptide sequence that exactly matches the amino acid
sequence in peptide "Fragment F2" (SEQ ID NO: 2).

An exact 50 base oligonucleotide designated DSAT-50
is synthesized to match the sequence of the PCR reaction
product from the first valine residue to the last tyrosine
residue.

The probe DSAT-50, 5'-
GTAAGTAGGTAGGGCTTCCTCTGTAATCATATCTCCAACCAAAACAACAA-3' (SEQ
ID NO: 32), is used to probe the C. tinctorius embryo cDNA
library.
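
The relationship between the probe sequence and the peptide it was designed from can be illustrated by translating the probe's complementary strand. The sketch below uses the DSAT-50 sequence as given in the text and the standard genetic code; the choice of reading frame (skipping the first two bases of the reverse complement) is an assumption made here for illustration, as are the function names.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

DSAT_50 = "GTAAGTAGGTAGGGCTTCCTCTGTAATCATATCTCCAACCAAAACAACAA"  # SEQ ID NO: 32


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons of seq, starting at its first base."""
    return "".join(
        CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3)
    )


coding_strand = reverse_complement(DSAT_50)
# Assumed frame: skip the first two bases of the complementary strand.
peptide = translate(coding_strand[2:])
print(peptide)
```

Read in this frame, the complementary strand encodes a 16-residue peptide that begins with valine and ends with tyrosine, which agrees with the description of the probe as spanning the PCR product from the first valine to the last tyrosine. This suggests that the probe as written is the antisense strand.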

Library screen 

The C. tinctorius embryo cDNA bank is moved into the
cloning vector lambda gt10 (Stratagene Cloning Systems) by
digestion of total cDNA with EcoRI and ligation to lambda
gt10 DNA digested with EcoRI. The titer of the resulting
library was ~5x10^/ml. The library is then plated on E.
coli strain C600 (Huynh, et al., DNA Cloning Vol. 1 Eds.
Glover D.M. IRL Press Limited: Oxford England, pp. 56, 110)
at a density of 5000 plaques/150 mm NZY ("NZYM" as defined
in Maniatis et al. supra) agar plate to provide over 45,000
plaques for screening. Duplicate lifts are taken of the
plaques using NEN Colony Plaque Screen filters by laying
precut filters over the plates for ~1 minute and then
peeling them off. The phage DNA is immobilized by floating
the filters on denaturing solution (1.5M NaCl, .05M NaOH)
for 1 min., transferring the filters to neutralizing
solution (1.5M NaCl, 0.5M Tris-HCl pH 8.0) for 2 min., and
then to 2XSSC (1xSSC = 0.15M NaCl; 0.015M Na citrate) for 3
min., followed by air drying. The filters are hybridized
with 32P end-labeled DSAT-50 oligonucleotide (SEQ ID NO:
32) (BRL 5' DNA Terminus Labeling System) by the method of
Devlin et al. (DNA (1988) 7:499-507) at 42°C overnight,
and washed for 30 min. at 50°C in 2XSSC, 0.5% SDS and then
twice for 20 min. each at 50°C in 0.1XSSC, 0.5% SDS.

Filters are exposed to X-ray film at -70°C with a Dupont
Cronex intensifying screen for 48 hours.

Clones are detected by hybridization with the DSAT-50
oligonucleotide and plaque purified. The complete
nucleotide sequence (SEQ ID NO: 12) of the cDNA insert of a
clone, pCGN2754, and a partial restriction map thereof are
presented in Figures 2 and 7A, respectively. The cDNA
insert includes 1533 bases plus a poly(A) tract at the 3'
end of 100-200 bases. The open reading frame for the
desaturase begins at the first ATG (nucleotide 106) from
the 5' end and stops at nucleotide 1294. The translated
amino acid sequence is presented in Fig. 2 and SEQ ID NO:
13. The open reading frame includes a 33 amino acid
transit peptide not found in the amino acid sequence of the
mature protein. The N-terminus of the protein begins at
the alanine immediately following the NcoI site (nucleotide
202) indicating the site of the transit peptide processing.
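
The coordinates given for the open reading frame allow a rough consistency check. The sketch below assumes that "stops at nucleotide 1294" refers to the first base of the stop codon, and uses an average residue mass of 110 Da, which is a common rule of thumb and not a value from the text. The variable names are illustrative.

```python
ORF_START_NT = 106          # first base of the initiating ATG
STOP_CODON_NT = 1294        # assumed: first base of the stop codon
TRANSIT_PEPTIDE_AA = 33     # residues removed on processing
AVERAGE_RESIDUE_DA = 110.0  # rule-of-thumb average residue mass (assumed)

coding_nt = STOP_CODON_NT - ORF_START_NT
precursor_aa = coding_nt // 3
mature_aa = precursor_aa - TRANSIT_PEPTIDE_AA

# First base of the codon for the first residue of the mature protein.
mature_start_nt = ORF_START_NT + 3 * TRANSIT_PEPTIDE_AA

approx_mature_kda = mature_aa * AVERAGE_RESIDUE_DA / 1000.0

print(f"coding length: {coding_nt} nt ({precursor_aa} codons)")
print(f"mature protein: {mature_aa} residues, ~{approx_mature_kda:.0f} kD")
print(f"mature coding region starts near nucleotide {mature_start_nt}")
```

On these assumptions the coding region is an exact multiple of three bases, nucleotide 202 is the first base of codon 33, and the mature protein would begin with the codon at nucleotide 205. The estimated size of the mature protein, about 40 kD, is of the same order as the approximately 43 kD band observed by SDS-PAGE.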

Example 6

In this example, expression of a plant desaturase in a
prokaryote is described.

Desaturase expression construct in E. coli

A plasmid for expression of desaturase activity in E.
coli is constructed as follows. The desaturase cDNA clone
pCGN2754 is digested with HindIII and SalI and ligated to
pCGN2016 (a chloramphenicol resistant version of Bluescript
KS-) digested with HindIII and XhoI. The resulting plasmid
is pCGN1894.

pCGN2016 is prepared by digesting pCGN565 with HhaI,
and the fragment containing the chloramphenicol resistance
gene is excised, blunted by use of mung bean nuclease, and
inserted into the EcoRV site of Bluescript KS- (Stratagene:
La Jolla, CA) to create pCGN2008. The chloramphenicol
resistance gene of pCGN2008 is removed by EcoRI/HindIII
digestion. After treatment with Klenow enzyme to blunt the
ends, the fragment is ligated to DraI digested Bluescript
KS-. A clone that has the DraI fragment containing
ampicillin resistance replaced with the chloramphenicol
resistance is chosen and named pCGN2016.

pCGN565 is a cloning vector based on pUC12-cm (K. Buckley
Ph.D. Thesis, Regulation and expression of the phi X174 lysis
gene, University of California, San Diego, 1985), but contains
pUC18 linkers (Yanisch-Perron, et al., Gene (1985) 33:103-119).

The fragment containing the mature coding region of
the Δ-9 desaturase, 3'-noncoding sequences and poly(A)
tails is subcloned from pCGN1894 digested with NcoI and
Asp718 into pUC120, an E. coli expression vector based on
pUC118 (Vieira and Messing, Methods in Enzymology (1987)
153:3-11) with the lac region inserted in the opposite
orientation and an NcoI site at the ATG of the lac peptide
(Vieira, J. Ph.D. Thesis, University of Minnesota, 1988).
The E. coli desaturase expression plasmid is designated
pCGN3201. The desaturase sequences are inserted such that
they are aligned with the lac transcription and translation
signals.

Expression of Desaturase in E. coli

Single colonies of E. coli strain 7118 (Maniatis et
al., supra) containing pUC120 or pCGN3201 are cultured in
80 mls each of ECLB broth, 300 mg/L penicillin. The cells
are induced by the addition of 1mM IPTG. Cells are grown
overnight (18 hrs) at 37°C.

Eighty mls of overnight cultures of E. coli (induced
and uninduced) containing pUC120 or pCGN3201 are
centrifuged at 14,800 x g for 15 min. The pelleted cells
are resuspended in 3 mls 20 mM phosphate buffer, pH 6.8.
Resuspended cells were broken in a french press at 16,000
psi. Broken cell mixtures are centrifuged 5000xg for 5
min. 100 µl of each supernatant is applied to a G-25
Sephadex gel filtration centrifugal column (Boehringer
Mannheim Biochemicals), equilibrated in 20mM phosphate
buffer pH 6.8. Columns are spun for 4 min at 5000xg.
Effluent was collected and used as enzyme source in the
desaturase assay.

Desaturase activity is assayed as described in Example
1. Neither IPTG-induced nor uninduced cells containing
pUC120 express detectable stearoyl-ACP desaturase
activity. The pCGN3201 IPTG-induced extract contains 8.22
nmol/min of desaturase activity. pCGN3201 uninduced
extracts contain 6.45 nmol/min of activity. The pCGN3201
IPTG-induced extract thus shows more activity than the
uninduced pCGN3201 extract, the difference amounting to
21.5% of the induced activity.
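
The comparison of induced and uninduced activities can be expressed relative to either value. The short sketch below takes the induced activity as 8.22 nmol/min and the uninduced activity as 6.45 nmol/min and computes both ratios; the variable names are illustrative.

```python
induced_nmol_per_min = 8.22
uninduced_nmol_per_min = 6.45

difference = induced_nmol_per_min - uninduced_nmol_per_min

# Difference expressed as a percentage of the induced activity.
percent_of_induced = 100.0 * difference / induced_nmol_per_min

# Difference expressed as a percentage increase over the uninduced activity.
percent_over_uninduced = 100.0 * difference / uninduced_nmol_per_min

print(f"difference: {difference:.2f} nmol/min")
print(f"relative to induced: {percent_of_induced:.1f}%")
print(f"relative to uninduced: {percent_over_uninduced:.1f}%")
```

The 21.5% figure is reproduced when the difference is taken relative to the induced activity; relative to the uninduced activity the increase is somewhat larger.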

Detection of induced protein in E. coli. 

Extracts of overnight cultures of E. coli strain 7118
(Maniatis et al. supra) containing pCGN3201 or pUC120
grown in ECLB containing 300 mg/L penicillin induced with
1mM IPTG are prepared as follows. 1.5 ml of overnight
culture grown shaking at 37°C are pelleted in Eppendorf
tubes for 10 min at 10-13,000 x g. Pellets are resuspended
in 150 µl SDS sample buffer (0.05M Tris-HCl, pH 6.8, 1% SDS,
5% β-mercaptoethanol, 10% glycerol and 0.005% bromophenol
blue) and boiled for 10 min. 25 µl of each sample are
electrophoresed on a 10% polyacrylamide gel (Laemmli,
Nature (1970) 227:680) at 25 mA for 5 hours. Gels are
stained in 0.05% Coomassie Brilliant Blue, 25% isopropanol
and 10% acetic acid and destained in 10% acetic acid and
10% isopropanol. A band is detected at a position just
below the 43,000 MW protein marker (SDS PAGE standard, Low
molecular weight, BioRad, Richmond CA) in the pCGN3201
extracts that is not present in the pUC120 extracts. This
is the approximate molecular weight of mature desaturase
protein.

Requirement for Spinach Ferredoxin 

Stearoyl-ACP desaturase can also be expressed in E.
coli by subcloning into the E. coli expression vector
pET8c (Studier et al., Methods Enzymol. (1990) 185:60-89).
The mature coding region (plus an extra Met codon) of the
desaturase cDNA with accompanying 3'-sequences is inserted
as an NcoI-SmaI fragment into pET8c at the NcoI and
BamHI sites (after treatment of the BamHI site with Klenow
fragment of DNA polymerase to create a blunt end) to create
pCGN3208. The plasmid pCGN3208 is maintained in E. coli
strain BL21(DE3), which contains the T7 polymerase gene
under the control of the isopropyl-β-D-
thiogalactopyranoside (IPTG)-inducible lacUV5 promoter
(Studier et al., supra).

E. coli cells containing pCGN3208 are grown at 37°C to
an OD595 of ~0.7 in NZY broth containing 0.4% (w/v) glucose
and 300 mg/liter penicillin, and then induced for 3 hours
with 0.4 mM IPTG. Cells are pelleted from 1 ml of culture,
dissolved in 125 µl of SDS sample buffer (10) and heated to
100°C for 10 min. Samples are analyzed by SDS
polyacrylamide gel electrophoresis at 25 mA for 5 h. Gels
are stained in 0.05% Coomassie Brilliant Blue, 25% (v/v)
isopropanol and 10% (v/v) acetic acid. A band is detected
at a position just below the 43,000 MW protein marker (SDS
PAGE standard, Low Molecular Weight, BioRad, Richmond, CA)
in the pCGN3208 extract that is not present in the pET8c
extracts. This is the approximate molecular weight of
mature desaturase protein.

For activity assays, cells are treated as described
above and extracts are used as enzyme source in the
stearoyl-ACP desaturase assay as described in Example 1.
The extract from IPTG-induced pCGN3208 cells contains 8.61
nmol/min/mg protein of desaturase activity. The extract
from pCGN3208 uninduced cells contains 1.41 nmol/min/mg
protein of desaturase activity. The extract from pCGN3208
induced cells thus has approximately 6-fold greater
activity than the extract from uninduced pCGN3208 cells.
Extracts from both induced and uninduced cells of pET8c do
not contain detectable stearoyl-ACP desaturase activity.
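
The fold difference quoted for pCGN3208 can be checked with a
short calculation. The sketch below is illustrative only; the
function name fold_induction is an assumption introduced here,
and the two activities are the values reported in the text.

```python
def fold_induction(induced, uninduced):
    """Return the ratio of induced to uninduced specific activity."""
    if uninduced <= 0:
        raise ValueError("uninduced activity must be positive")
    return induced / uninduced


# Specific activities (nmol/min/mg protein) as reported for pCGN3208 extracts.
induced_pcgn3208 = 8.61
uninduced_pcgn3208 = 1.41

ratio = fold_induction(induced_pcgn3208, uninduced_pcgn3208)
print(f"fold induction: {ratio:.1f}")
```

The ratio rounds to 6, in line with the "approximately 6-fold"
statement.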
Samples are also assayed as described in Example 1,
but without the addition of spinach ferredoxin, to
determine if the E. coli ferredoxin is an efficient
electron donor for the desaturase reaction. Minimal
activity is detected in E. coli extracts unless spinach
ferredoxin is added exogenously.

In this example, the preparation of an ACP expression
cassette containing a plant desaturase in a binary vector
suitable for plant transformation is described.

ACP Expression Cassette 

An expression cassette utilizing 5'-upstream sequences
and 3'-downstream sequences obtainable from B. campestris
ACP gene can be constructed as follows.

A 1.45 kb XhoI fragment of Bcg 4-4 (Fig. 9 and SEQ ID
NO: 28) containing 5'-upstream sequences is subcloned into
the cloning/sequencing vector Bluescript+ (Stratagene
Cloning Systems, San Diego, CA). The resulting construct,
pCGN1941, is digested with XhoI and ligated to a
chloramphenicol resistant Bluescript M13+ vector, pCGN2015,
digested with XhoI. pCGN2015 is prepared as described for
pCGN2016 (see Example 6) except that the EcoRI/HindIII
"chloramphenicol" fragment isolated from pCGN2008 is
ligated with the 2273 bp fragment of Bluescript KS+
(Stratagene; La Jolla, CA) isolated after digestion with
DraI. This alters the antibiotic resistance of the plasmid
from penicillin resistance to chloramphenicol resistance.
The chloramphenicol resistant plasmid is pCGN1953.

3'-sequences of Bcg 4-4 are contained on an SstI/BglII
fragment cloned in the SstI/BamHI sites of M13 Bluescript+
vector. This plasmid is named pCGN1940. pCGN1940 is
modified by in vitro site-directed mutagenesis (Adelman et
al., DNA (1983) 2:183-193) using the synthetic
oligonucleotide 5'-CTTAAGAAGTAACCCGGGCTGCAGTTTTAGTATTAAGAG-
3' (SEQ ID NO: 33) to insert SmaI and PstI restriction
sites immediately following the stop codon of the reading
frame for the ACP gene 18 nucleotides from the SstI site.
The 3'-noncoding sequences from this modified plasmid,
pCGN1950, are moved as a PstI-SmaI fragment into pCGN1953
cut with PstI and SmaI. The resulting plasmid pCGN1977
comprises the ACP expression cassette with the unique
restriction sites EcoRV, EcoRI and PstI available between
the 1.45 kb 5' and 1.5 kb of 3'-noncoding sequences (SEQ ID
NO: 28) for the cloning of genes to be expressed under
regulation of these ACP gene regions.

Desaturase Expression In Plants 
Desaturase cDNA sequences from pCGN2754 are inserted
in the ACP expression cassette, pCGN1977, as follows.
pCGN2754 is digested with HindIII (located 160 nucleotides
upstream of the start codon) and Asp718 (located in the
polylinker outside the poly(A) tails). The fragment
containing the coding region for desaturase is blunt-ended
using DNA polymerase I and ligated to pCGN1977 digested
with EcoRV. A clone containing the desaturase sequences in
the sense orientation with respect to the ACP promoter is
selected and called pCGN1895. This expression cassette may
be inserted into a binary vector, for example, for
Agrobacterium-mediated transformation, or employed in other
plant transformation techniques.

Binary Vector and Agrobacterium Transformation

The fragment containing the pCGN1895 expression
sequences ACP 5'/desaturase/ACP 3' is cloned into a binary
vector pCGN1557 (described below) for Agrobacterium
transformation by digestion with Asp718 and XbaI and
ligation to pCGN1557 digested with Asp718 and XbaI. The
resulting binary vector is called pCGN1898.

pCGN1898 is transformed into Agrobacterium tumefaciens
strain EHA101 (Hood et al., J. Bacteriol. (1986) 168:1291-
1301) as per the method of Holsters et al., Mol. Gen.
Genet. (1978) 163:181-187.

RNA blot analysis of seeds (T2) from T1 plants shows
the presence of an mRNA species for the inserted C.
tinctorius desaturase, but the amount of message is low
compared to endogenous levels of mRNA for the Brassica
desaturase, suggesting that the expression levels were
insufficient to significantly increase the amount of
desaturase enzyme above that normally present. This is
consistent with the negative results from oil, desaturase
activity and Western blot analyses.

pCGN1557 (McBride and Summerfelt, Plant Molecular
Biology (1990) 14(2):269-276) is a binary plant
transformation vector containing the left and right T-DNA
borders of Agrobacterium tumefaciens octopine Ti-plasmid
pTiA6 (Currier and Nester, supra), the gentamycin resistance
gene of pPH1JI (Hirsch and Beringer, supra), an
Agrobacterium rhizogenes Ri plasmid origin of replication
from pLJbB11 (Jouanin et al., supra), a 35S promoter-kanR-
tml3' region capable of conferring kanamycin resistance to
transformed plants, a ColE1 origin of replication from
pBR322 (Bolivar et al., supra), and a lacZ' screenable
marker gene from pUC18 (Yanisch-Perron et al., supra).

There are three major intermediate constructs used to
generate pCGN1557:

pCGN1532 (see below) contains the pCGN1557 backbone,
the pRi plasmid origin of replication, and the ColE1 origin
of replication.

pCGN1546 (see below) contains the CaMV 35S 5'-kanR-tml3'
plant selectable marker region.

pCGN1541b (see below) contains the right and left T-
DNA borders of the A. tumefaciens octopine Ti-plasmid and
the lacZ' region from pUC19.

To construct pCGN1557 from the above plasmids,
pCGN1546 is digested with XhoI, and the fragment containing
the CaMV 35S 5'-kanR-tml3' region is cloned into the XhoI
site of pCGN1541b to give the plasmid pCGN1553, which
contains T-DNA/left border/CaMV 35S 5'-kanR-tml3'/lacZ'/T-
DNA left border. pCGN1553 is digested with BglII, and the
fragment containing the T-DNA/left border/CaMV 35S 5'-kanR-
tml3'/lacZ'/T-DNA left border region is ligated into BamHI-
digested pCGN1532 to give the complete binary vector,
pCGN1557.
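
The assembly order just described can be written down as a small
dependency graph. The sketch below is only an illustration of
that bookkeeping; the dictionary layout and the helper name
build_order are assumptions introduced here, while the plasmid
names and parent relationships are taken from the text. It uses
the standard-library graphlib module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Each construct maps to the constructs it is assembled from (per the text).
parents = {
    "pCGN1553": {"pCGN1546", "pCGN1541b"},
    "pCGN1557": {"pCGN1553", "pCGN1532"},
}


def build_order(graph):
    """Return the constructs in an order where parents come before products."""
    return list(TopologicalSorter(graph).static_order())


order = build_order(parents)
print(order)
```

Any valid ordering lists the three intermediates before pCGN1553
is needed and ends with pCGN1557.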
pCGN1532

The 3.5 kb EcoRI-PstI fragment containing the
gentamycin resistance gene is removed from pPH1JI (Hirsch
and Beringer, Plasmid (1984) 12:139-141) by EcoRI-PstI
digestion and cloned into EcoRI-PstI digested pUC9 (Vieira
and Messing, Gene (1982) 19:259-268) to generate pCGN549.
HindIII-PstI digestion of pCGN549 yields a 3.1 kb fragment
bearing the gentamycin resistance gene, which is made blunt
ended by the Klenow fragment of DNA polymerase I and cloned
into PvuII digested pBR322 (Bolivar et al., Gene (1977)
2:95-113) to create pBR322Gm. pBR322Gm is digested with
DraI and SphI, treated with Klenow enzyme to create blunt
ends, and the 2.8 kb fragment cloned into the Ri origin-
containing plasmid pLJbB11 (Jouanin et al., Mol. Gen.
Genet. (1985) 201:370-374) which has been digested with
ApaI and made blunt-ended with Klenow enzyme, creating
pLHbB11Gm. The extra ColE1 origin and the kanamycin
resistance gene are deleted from pLHbB11Gm by digestion
with BamHI followed by self closure to create pGmB11. The
HindIII site of pGmB11 is deleted by HindIII digestion
followed by treatment with Klenow enzyme and self closure,
creating pGmB11-H. The PstI site of pGmB11-H is deleted by
PstI digestion followed by treatment with Klenow enzyme and
self closure, creating pCGN1532.

Construction of pCGN1546

The 35S promoter-tml3' expression cassette, pCGN986,
contains a cauliflower mosaic virus 35S (CaMV 35S) promoter
and a T-DNA tml 3'-region with multiple restriction sites
between them. pCGN986 is derived from another cassette,
pCGN206, containing a CaMV 35S promoter and a different 3'
region, the CaMV region VI 3'-end. The CaMV 35S promoter
is cloned as an AluI fragment (bp 7144-7734) (Gardner et
al., Nucl. Acids Res. (1981) 9:2871-2888) into the HincII
site of M13mp7 (Messing et al., Nucl. Acids Res. (1981)
9:309-321) to create C614. An EcoRI digest of C614
produces the EcoRI fragment from C614 containing the 35S
promoter, which is cloned into the EcoRI site of pUC8
(Vieira and Messing, Gene (1982) 19:259) to produce
pCGN147.

pCGN148a, containing a promoter region, selectable
marker (KAN with 2 ATG's) and 3' region, is prepared by
digesting pCGN528 with BglII and inserting the BamHI-BglII
promoter fragment from pCGN147. This fragment is cloned
into the BglII site of pCGN528 so that the BglII site is
proximal to the kanamycin gene of pCGN528.

The shuttle vector used for this construct, pCGN528, is
made as follows: pCGN525 is made by digesting a plasmid
containing Tn5, which harbors a kanamycin gene (Jorgenson
et al., Mol. Gen. Genet. (1979) 177:65), with HindIII-BamHI
and inserting the HindIII-BamHI fragment containing the
kanamycin gene into the HindIII-BamHI sites in the
tetracycline gene of pACYC184 (Chang and Cohen, J.
Bacteriol. (1978) 134:1141-1156). pCGN526 is made by
inserting the BamHI fragment 19 of pTiA6 (Thomashow et
al., Cell (1980) 19:729-739), modified with XhoI linkers
inserted into the SmaI site, into the BamHI site of
pCGN525. pCGN528 is obtained by deleting the small XhoI
fragment from pCGN526 by digesting with XhoI and
religating.

pCGN149a is made by cloning the BamHI-kanamycin gene
fragment from pMB9KanXXI into the BamHI site of pCGN148a.
pMB9KanXXI is a pUC4K variant (Vieira and Messing, Gene
(1982) 19:259-268) which has the XhoI site missing, but
contains a functional kanamycin gene from Tn903 to allow
for efficient selection in Agrobacterium.

pCGN149a is digested with HindIII and BamHI and
ligated to pUC8 digested with HindIII and BamHI to produce
pCGN169. This removes the Tn903 kanamycin marker. pCGN565
(see pCGN2016 description) and pCGN169 are both digested
with HindIII and PstI and ligated to form pCGN203, a
plasmid containing the CaMV 35S promoter and part of the
5'-end of the Tn5 kanamycin gene (up to the PstI site,
Jorgenson et al., (1979), supra). A 3'-regulatory region
is added to pCGN203 from pCGN204 (an EcoRI fragment of CaMV
(bp 408-6105) containing the region VI 3' cloned into pUC18
(Yanisch-Perron et al., Gene (1985) 53:103-119)) by
digestion with HindIII and PstI and ligation. The
resulting cassette, pCGN206, is the basis for the
construction of pCGN986.

The pTiA6 T-DNA tml 3'-sequences are subcloned from
the Bam19 T-DNA fragment (Thomashow et al., (1980) supra)
as a BamHI-EcoRI fragment (nucleotides 9062 to 12,823,
numbering as in Barker et al., Plant Mol. Biol. (1983)
2:335-350) and combined with the pACYC184 (Chang and Cohen
(1978), supra) origin of replication as an EcoRI-HindIII
fragment and a gentamycin resistance marker (from plasmid
pLB41, obtained from D. Figurski) as a BamHI-HindIII
fragment to produce pCGN417.

The unique SmaI site of pCGN417 (nucleotide 11,207 of
the Bam19 fragment) is changed to a SacI site using linkers
and the BamHI-SacI fragment is subcloned into pCGN565 to
give pCGN971. The BamHI site of pCGN971 is changed to an
EcoRI site using linkers. The resulting EcoRI-SacI
fragment containing the tml 3' regulatory sequences is
joined to pCGN206 by digestion with EcoRI and SacI to give
pCGN975. The small part of the Tn5 kanamycin resistance
gene is deleted from the 3'-end of the CaMV 35S promoter by
digestion with SalI and BglII, blunting the ends and
ligation with SalI linkers. The final expression cassette
pCGN986 contains the CaMV 35S promoter followed by two SalI
sites, an XbaI site, BamHI, SmaI, KpnI and the tml 3'
region (nucleotides 11207-9023 of the T-DNA).

The 35S promoter-tml 3' expression cassette, pCGN986,
is digested with HindIII. The ends are filled in with
Klenow polymerase and XhoI linkers added. The resulting
plasmid is called pCGN986X. The BamHI-SacI fragment of
pBRX25 (see below) containing the nitrilase gene is
inserted into BamHI-SacI digested pCGN986X, yielding pBRX66.

Construction of pBRX25 is described in U.S. Letters
Patent 4,810,648, which is hereby incorporated by
reference. Briefly, the method is as follows: The
nucleotide sequence of a 1212-bp PstI-HincII DNA segment
encoding the bromoxynil-specific nitrilase contains 65 bp
of 5' untranslated nucleotides. To facilitate removal of a
portion of these excess nucleotides, plasmid pBRX9 is
digested with PstI and treated with nuclease Bal31. BamHI
linkers are added to the resulting ends. BamHI-HincII
fragments containing a functional bromoxynil gene are
cloned into the BamHI-SmaI sites of pCGN565. The resulting
plasmid, pBRX25, contains only 11 bp of 5' untranslated
bacterial sequence.

pBRX66 is digested with PstI and EcoRI, blunt ends
generated by treatment with Klenow polymerase, and XhoI
linkers added. The resulting plasmid pBRX68 now has a tml
3' region that is approximately 1.1 kb. pBRX68 is digested
with SalI and SacI, blunt ends generated by treatment with
Klenow polymerase and EcoRI linkers added. The resulting
plasmid, pCGN986XE, is a 35S promoter-tml 3' expression
cassette lacking the nitrilase gene.

The Tn5 kanamycin resistance gene is then inserted
into pCGN986XE. The 1.0 kb EcoRI fragment of pCGN1536 (see
pCGN1547 description) is ligated into pCGN986XE digested
with EcoRI. A clone with the Tn5 kanamycin resistance gene
in the correct orientation for transcription and
translation is chosen and called pCGN1537b. The 35S
promoter-KanR-tml 3' region is then transferred to a
chloramphenicol resistant plasmid backbone. pCGN786 (a
pUC-CAM based vector with the synthetic oligonucleotide 5'
GGAATTCGTCGACAGATCTCTGCAGCTCGAGGGATCCAAGCTT 3' (SEQ ID NO:
34) containing the cloning sites EcoRI, SalI, BglII, PstI,
XhoI, BamHI, and HindIII inserted into pCGN566; pCGN566
contains the EcoRI-HindIII linker of pUC18 inserted into
the EcoRI-HindIII sites of pUC13-cm (K. Buckler (1985)
supra)) is digested with XhoI and the XhoI fragment of
pCGN1537b containing the 35S promoter-KanR-tml 3' region
is ligated in. The resulting clone is termed pCGN1546.
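
The cloning sites listed for the SEQ ID NO: 34 linker can be
located by a simple string scan. The sketch below is
illustrative; the helper name site_positions is an assumption
introduced here, and the recognition sequences are the standard
ones for the seven named enzymes. Because all seven sites are
palindromic, scanning the printed strand alone is sufficient.

```python
# Standard recognition sequences for the enzymes named in the text.
SITES = {
    "EcoRI": "GAATTC",
    "SalI": "GTCGAC",
    "BglII": "AGATCT",
    "PstI": "CTGCAG",
    "XhoI": "CTCGAG",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}

# Linker sequence as printed for SEQ ID NO: 34.
LINKER = "GGAATTCGTCGACAGATCTCTGCAGCTCGAGGGATCCAAGCTT"


def site_positions(seq, sites):
    """Return sorted (0-based start, enzyme) pairs for every site occurrence."""
    hits = []
    for enzyme, motif in sites.items():
        start = seq.find(motif)
        while start != -1:
            hits.append((start, enzyme))
            start = seq.find(motif, start + 1)
    return sorted(hits)


hits = site_positions(LINKER, SITES)
for start, enzyme in hits:
    print(f"{enzyme:8s} starts at position {start + 1}")
```

Each enzyme is found once, in the same 5' to 3' order in which
the text lists the sites.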

pCGN1541b

pCGN565RBα2X (see below) is digested with BglII and
XhoI, and the 728 bp fragment containing the T-DNA right
border piece and the lacZ' gene is ligated with BglII-XhoI
digested pCGN65ΔKX-S+X (see below), replacing the BglII-
XhoI right border fragment of pCGN65ΔKX-S+X. The
resulting plasmid, pCGN65α2X, contains both T-DNA borders
and the lacZ' gene. The ClaI fragment of pCGN65α2X is
replaced with an XhoI site by digesting with ClaI, blunting
the ends using the Klenow fragment, and ligating with XhoI
linker DNA, resulting in plasmid pCGN65α2XX. pCGN65α2XX
is digested with BglII and EcoRV, treated with the Klenow
fragment of DNA polymerase I to create blunt ends, and
ligated in the presence of BglII linker DNA, resulting in
pCGN65α2XX'. pCGN65α2XX' is digested with BglII and
ligated with BglII digested pCGN1538 (see below), resulting
in pCGN1541a, which contains both plasmid backbones.
pCGN1541a is digested with XhoI and religated. Ampicillin
resistant, chloramphenicol sensitive clones are chosen,
which lack the pACYC184-derived backbone, creating
pCGN1541b.

pCGN1538 is generated by digesting pBR322 with EcoRI
and PvuII, treating with Klenow to generate blunt ends, and
ligating with BglII linkers. pCGN1538 is ampicillin
resistant, tetracycline sensitive.

pCGN501 is constructed by cloning a 1.85 kb EcoRI-XhoI
fragment of pTiA6 (Currier and Nester, J. Bact. (1976)
125:157-165) containing bases 13362-15208 (Barker et al.,
Plant Mol. Biol. (1983) 2:335-350) of the T-DNA (right
border), into EcoRI-SalI digested M13mp9 (Vieira and
Messing, Gene (1982) 19:259-268). pCGN502 is constructed
by cloning a 1.6 kb HindIII-SmaI fragment of pTiA6,
containing bases 602-2212 of the T-DNA (left border), into
HindIII-SmaI digested M13mp9. pCGN501 and pCGN502 are both
digested with EcoRI and HindIII and both T-DNA-containing
fragments cloned together into HindIII digested pUC9
(Vieira and Messing, Gene (1982) 19:259-268) to yield
pCGN503, containing both T-DNA border fragments. pCGN503
is digested with HindIII and EcoRI and the two resulting
HindIII-EcoRI fragments (containing the T-DNA borders) are
cloned into EcoRI digested pHC79 (Hohn and Collins, Gene
(1980) 11:291-298) to generate pCGN518. The 1.6 kb KpnI-
EcoRI fragment from pCGN518, containing the left T-DNA
border, is cloned into KpnI-EcoRI digested pCGN565 to
generate pCGN580. The BamHI-BglII fragment of pCGN580 is
cloned into the BamHI site of pACYC184 (Chang and Cohen, J.
Bacteriol. (1978) 134:1141-1156) to create pCGN51. The 1.4
kb BamHI-SphI fragment of pCGN60 containing the T-DNA right
border fragment, is cloned into BamHI-SphI digested pCGN51
to create pCGN65, which contains the right and left T-DNA
borders.
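
The fragment sizes quoted alongside T-DNA base coordinates can be
cross-checked by counting bases. The sketch below is illustrative
only; the helper name inclusive_length is an assumption
introduced here, and counting both endpoints is an assumed
convention. The coordinates and stated sizes are those given in
the text for pCGN501 and pCGN502.

```python
def inclusive_length(start, end):
    """Number of bases from start to end, counting both endpoints."""
    lo, hi = sorted((start, end))
    return hi - lo + 1


# (description, first base, last base, size stated in the text in kb)
fragments = [
    ("pCGN501 right-border insert", 13362, 15208, 1.85),
    ("pCGN502 left-border insert", 602, 2212, 1.6),
]

for name, start, end, stated_kb in fragments:
    bp = inclusive_length(start, end)
    print(f"{name}: {bp} bp (stated {stated_kb} kb)")
```

Both computed lengths agree with the stated sizes to within
rounding.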

pCGN65 is digested with KpnI and XbaI, treated with
Klenow enzyme to create blunt ends, and ligated in the
presence of synthetic BglII linker DNA to create pCGN65ΔKX.
pCGN65ΔKX is digested with SalI, treated with Klenow enzyme
to create blunt ends, and ligated in the presence of
synthetic XhoI linker DNA to create pCGN65ΔKX-S+X.

pCGN565RBα2X

pCGN451 (see below) is digested with HpaI and ligated
in the presence of synthetic SphI linker DNA to generate
pCGN55. The XhoI-SphI fragment of pCGN55 (bp 13800-15208,
including the right border, of Agrobacterium tumefaciens T-
DNA; (Barker et al., Gene (1977) 2:95-113)) is cloned into
SalI-SphI digested pUC19 (Yanisch-Perron et al., Gene
(1985) 53:103-119) to create pCGN60. The 1.4 kb HindIII-
BamHI fragment of pCGN60 is cloned into HindIII-BamHI
digested pSP64 (Promega, Inc.) to generate pCGN1039.
pCGN1039 is digested with SmaI and NruI (deleting bp 14273-
15208; (Barker et al., Gene (1977) 2:95-113)) and ligated in
the presence of synthetic BglII linker DNA, creating
pCGN1039ΔNS. The 0.47 kb EcoRI-HindIII fragment of
pCGN1039ΔNS is cloned into EcoRI-HindIII digested pCGN565
to create pCGN565RB. The HindIII site of pCGN565RB is
replaced with an XhoI site by digesting with HindIII,
treating with Klenow enzyme, and ligating in the presence
of synthetic XhoI linker DNA to create pCGN565RB-H+X.

pUC18 (Norrander et al., Gene (1983) 25:101-106) is
digested with HaeII to release the lacZ' fragment, treated
with Klenow enzyme to create blunt ends, and the lacZ'-
containing fragment ligated into pCGN565RB-H+X, which had
been digested with AccI and SphI and treated with Klenow
enzyme, in such an orientation that the lacZ' promoter is
proximal to the right border fragment; this construct,
pCGN565RBα2X, is positive for lacZ' expression when plated
on an appropriate host and contains bp 13990-14273 of the
right border fragment (Barker et al., Plant Mol. Biol.
(1983) 2:335-350), having deleted the AccI-SphI fragment (bp
13800-13990).
pCGN451

pCGN451 contains an ocs5'-ocs3' cassette, including
the T-DNA right border, cloned into a derivative of pUC8
(Vieira and Messing, supra). The modified vector is
derived by digesting pUC8 with HincII and ligating in the
presence of synthetic linker DNA, creating pCGN416, and
then deleting the EcoRI site of pCGN416 by EcoRI digestion
followed by treatment with Klenow enzyme and self-ligation
to create pCGN426.

The ocs5'-ocs3' cassette is created by a series of
steps from DNA derived from the octopine Ti-plasmid pTiA6
(Currier and Nester, supra). To generate the 5' end, which
includes the T-DNA right border, an EcoRI fragment of pTiA6
(bp 13362-16202 (the numbering is by Barker et al., (Plant
Mol. Biol. (1983) 2:335-350) for the closely related Ti
plasmid pTi15955)) is removed from pVK232 (Knauf and
Nester, Plasmid (1982) 8:45) by EcoRI digestion and cloned
into EcoRI digested pACYC184 (Chang and Cohen, supra) to
generate pCGN15.

The 2.4 kb BamHI-EcoRI fragment (bp 13774-16202) of
pCGN15 is cloned into EcoRI-BamHI digested pBR322 (Bolivar
et al., supra) to yield pCGN429. The 412 bp EcoRI-BamHI
fragment (bp 13362-13772) of pCGN15 is cloned into EcoRI-
BamHI digested pBR322 to yield pCGN407. The cut-down
promoter fragment is obtained by digesting pCGN407 with
XmnI (bp 13512), followed by resection with Bal31
exonuclease, ligation of synthetic EcoRI linkers, and
digestion with BamHI. Resulting fragments of approximately
130 bp are gel purified and cloned into M13mp9 (Vieira and
Messing, supra) and sequenced. A clone, 1-4, in which the
EcoRI linker has been inserted at bp 1362 between the
transcription initiation point and the translation
initiation codon is identified by comparison with the
sequence of de Greve et al., (J. Mol. Appl. Genet. (1982)
1:499-512). The EcoRI cleavage site is at position 13639,
downstream from the mRNA start site. The 141 bp EcoRI-
BamHI fragment of 1-4, containing the cut-down promoter, is
cloned into EcoRI-BamHI digested pBR322 to create pCGN428.
The 141 bp EcoRI-BamHI promoter piece from pCGN428, and the
2.5 kb EcoRI-BamHI ocs5' piece from pCGN429 are cloned
together into EcoRI digested pUC19 (Vieira and Messing,
supra) to generate pCGN442, reconstructing the ocs upstream
region with a cut-down promoter section.

To generate the ocs3' end, the HindIII fragment of
pLB41 (D. Figurski, UC San Diego) containing the gentamycin
resistance gene is cloned into HindIII digested pACYC184
(Chang and Cohen, supra) to create pCGN413b. The 4.7 kb
BamHI fragment of pTiA6 (supra), containing the ocs3'
region, is cloned into BamHI digested pBR325 (F. Bolivar,
Gene (1978) 4:121-136) to create 33c-19. The SmaI site at
position 11207 (Barker, supra) of 33c-19 is converted to an
XhoI site using a synthetic XhoI linker, generating
pCGN401.2. The 3.8 kb BamHI-EcoRI fragment of pCGN401.2 is
cloned into BamHI-EcoRI digested pCGN413b to create
pCGN419.

The ocs5'-ocs3' cassette is generated by cloning the
2.64 kb EcoRI fragment of pCGN442, containing the 5'
region, into EcoRI digested pCGN419 to create pCGN446. The
3.1 kb XhoI fragment of pCGN446, having the ocs5' region (bp
13639-15208) and ocs3' region (bp 11207-12823), is cloned
into the XhoI site of pCGN426 to create pCGN451.

Example 8

In this example, the preparation of a Bce-4 expression
cassette containing a plant desaturase is described.

The desaturase cDNA clone from pCGN2754, prepared as
described in Example 5, is modified by in vitro mutagenesis
to insert restriction sites immediately upstream of the ATG
start codon and downstream of the TGA stop codon. A
single-stranded template DNA is prepared for the
mutagenesis reaction from pCGN1894 (described in Example 6)
as described by Messing (Methods in Enzymol. (1983)
101:20-79). Synthetic oligonucleotides are synthesized on
an Applied Biosystems 380A DNA synthesizer. The
oligonucleotides used are

5'-CCATTTTTGATCTTCCTCGAGCCCGGGCTGCAGTTCTTCTTCTTCTTG-3'
(SEQ ID NO: 35) for the 5'-mutagenesis and

5'-GCTCGTTTTTTTTTTCTCTGCAGCCCGGGCTCGAGTCACAGCTTCACC-3'
(SEQ ID NO: 36) for the 3'-mutagenesis; both add PstI, SmaI
and XhoI sites flanking the coding region. Both
oligonucleotides are 5'-phosphorylated (BRL 5'-Terminus
labelling kit) and used for mutagenesis with the pCGN1894
template by the procedure of Adelman et al. (DNA (1983)
2:183-193). Alternatively, the desired restriction sites
may be inserted by PCR, using the 3' oligo described above
(SEQ ID NO: 36) and another oligo,

5' ACTGACTGCAGCCCGGGCTCGAGGAAGATCAAAAATGGCTCTTC 3' (SEQ ID
NO: 37) for the 3' and 5' primers, respectively. The
template in this polymerase chain reaction is DNA from
pCGN1894. The XhoI fragment from the resulting clone can
be subcloned into the Bce4 expression cassette, pCGN1870
(described below) at the unique XhoI site. This
Bce4/desaturase expression cassette can then be inserted in
a suitable binary vector, transformed into Agrobacterium
tumefaciens strain EHA101 and used to transform plants as
provided in Example 10.
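
The relationship between the 5' mutagenesis oligonucleotide (SEQ
ID NO: 35) and the alternative PCR primer (SEQ ID NO: 37) can be
examined by comparing the reverse complement of one with the
other. The sketch below is illustrative only; the helper names
are assumptions introduced here, and the character printed as
"6" in the scanned SEQ ID NO: 37 is assumed to be G.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_shared_run(a, b):
    """Longest substring common to a and b (simple dynamic programming)."""
    best, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best:
                    best, best_end = cur[j], i
        prev = cur
    return a[best_end - best:best_end]


SEQ35 = "CCATTTTTGATCTTCCTCGAGCCCGGGCTGCAGTTCTTCTTCTTCTTG"
SEQ37 = "ACTGACTGCAGCCCGGGCTCGAGGAAGATCAAAAATGGCTCTTC"  # "6" in the scan read as G

shared = longest_shared_run(reverse_complement(SEQ35), SEQ37)
print(shared)
```

The shared run spans the PstI, SmaI and XhoI sites and the ATG,
which suggests that the two oligonucleotides are written on
opposite strands of the same region.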

Bce-4 Expression Cassette

pCGN1870 is a Bce-4 expression cassette containing 5'
and 3' regulatory regions of the Bce-4 gene and may be
derived from the Bce-4 sequence found in pCGN1857, which
was deposited with the ATCC on March 9, 1990, and assigned
accession number 68251, or by methods known to one skilled
in the art from the sequence (SEQ ID NO: 27) provided in
Fig. 8. The Bce 4 gene may be isolated as follows:

The ClaI fragment of pCGN1857, containing the Bce4
gene, is ligated into ClaI digested Bluescript KS+
(Stratagene; La Jolla, CA), producing pCGN1864. Single
stranded DNA is made from pCGN1864 and altered by in vitro
mutagenesis using the oligonucleotides

BCE45P:

(5'GAGTAGTGAACTTCATGGATCCTCGAGGTCTTGAAAACCTAGA3') (SEQ ID
NO: 38) and

BCE43P:

(5'CAATGTCTTGAGAGATCCCGGGATCCTTAACAACTAGGAAAAGG3') (SEQ ID
NO: 39)

as described by Adelman et al. (DNA (1983) 2:183-193). The
oligonucleotide BSCP2 (5'GTAAGACACGACTTATCGCCACTG3') (SEQ
ID NO: 40), complementary to a portion of Bluescript, is
included in the reaction to improve the yield of double-
stranded DNA molecules. The resulting plasmid, pCGN1866,
contains XhoI and BamHI sites (from BCE45P) immediately 5'
to the Bce4 start codon and BamHI and SmaI sites (from
BCE43P) immediately 3' to the Bce4 stop codon. The ClaI
fragment of pCGN1866, containing the mutagenized
sequences, is inserted into the ClaI site of pCGN2016
(described in Example 6), producing pCGN1866C. The ClaI
fragment of pCGN1866C is used to replace the corresponding
wild-type ClaI fragment of pCGN1867 (described below) to
produce pCGN1868. Bce4 coding sequences are removed by
digestion of pCGN1868 with BamHI and recircularization of
the plasmid to produce pCGN1870. The Bce4 expression
cassette, pCGN1870, contains 7.4 kb of 5' regulatory
sequence and 1.9 kb of 3' regulatory sequence derived from
the Bce4 genomic clone separated by the cloning sites,
XhoI, BamHI, and SmaI. Desaturase sequences in sense or
anti-sense orientation may be inserted into the cassette
via the cloning sites and the resulting construct may be
employed in a plant transformation technique.

The BamHI and SmaI sites of pUC18 are removed by
BamHI-SmaI digestion and recircularizing of the plasmid,
without repair of the ends, to produce pCGN1862. The PstI
fragment of pCGN1857, containing the Bce4 gene, is inserted
into the PstI site of pCGN1862 to produce pCGN1867.

Example 9

In this example, the preparation of a napin 1-2
expression cassette containing a plant desaturase is
described.

Preparation of Desaturase Clone

The desaturase cDNA clone from pCGN2754 is prepared
and modified as described in Example 8. The XhoI fragment
from the resulting clone can be subcloned into the napin 1-
2 expression cassette, pCGN1808 (described below) at the
unique XhoI site. This napin 1-2/desaturase expression
cassette can then be inserted into a suitable binary
vector, transformed into A. tumefaciens strain EHA101 in a
like manner as described in Example 7.

Alternatively, the desaturase safflower clone may be
prepared such that restriction sites flank the translation
start and stop sites, as described in Example 8, with the
following modification. PCR was carried out according to
manufacturer's instructions except for the initial
annealing of the oligonucleotides to the template. The
reaction mix was heated to 90°C for 5 min, cooled to 37°C
over a one hour period, kept at 37°C for 20 min and then
subjected to standard PCR cycles. The PCR product was
digested with PstI and ligated to pUC8 (Vieira and Messing
(1982) Gene 19:259-268) digested with PstI to produce
pCGN3220. The NcoI/SacI fragment of pCGN3220 containing
the pUC8 vector and the 5' and 3' sequences of the
safflower desaturase cDNA was gel purified and ligated to
the gel-purified cloned NcoI/SacI fragment from pCGN1894
(see Example 6). The resulting plasmid pCGN3222 contains
safflower desaturase cDNA sequences partially from the cDNA
clone and partially from the PCR. The regions obtained
from the PCR were confirmed by DNA sequencing as being
identical to the original cloned sequence.

Expression Cassettes

Napin 1-2 pCGN1808 Expression Cassette

An expression cassette utilizing 5' upstream sequences
and 3' downstream sequences obtainable from B. campestris
napin gene can be constructed as follows.

A 2.7 kb XhoI fragment of napin 1-2 (Fig. 10 and SEQ
ID NO: 29) containing 5' upstream sequences is subcloned
into pCGN789 (a pUC based vector the same as pUC119 with
the normal polylinker replaced by the synthetic linker
5' GGAATTCGTCGACAGATCTCTGCAGCTCGAGGGATCCAAGCTT 3', SEQ ID
NO: 41, which represented the polylinker EcoRI, SalI,
BglII, PstI, XhoI, BamHI, HindIII) and results in pCGN940.
The majority of the napin coding region of pCGN940 was
deleted by digestion with SalI and religation to form
pCGN1800. Single-stranded DNA from pCGN1800 was used in an
in vitro mutagenesis reaction (Adelman et al., DNA (1983)
2:183-193) using the synthetic oligonucleotide 5'
GCTTGTTCGCCATGGATATCTTCTGTATGTTC 3', SEQ ID NO: 42. This
oligonucleotide inserted an EcoRV and an NcoI restriction
site at the junction of the promoter region and the ATG
start codon of the napin gene. An appropriate mutant was
identified by hybridization to the oligonucleotide used for
the mutagenesis and sequence analysis and named pCGN1801.
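
By way of illustration only, the restriction sites named above can be located in the two synthetic sequences with a short script. The recognition sequences are the standard ones for these enzymes. The linker string is an assumed reading, reconstructed from the order of sites recited in the text rather than taken from the sequence listing, and the names `SITES`, `find_sites`, `MUTAGENIC_OLIGO` and `ASSUMED_LINKER` are hypothetical.

```python
# Standard recognition sequences for the enzymes named in the text.
SITES = {
    "EcoRI": "GAATTC",
    "SalI": "GTCGAC",
    "BglII": "AGATCT",
    "PstI": "CTGCAG",
    "XhoI": "CTCGAG",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "EcoRV": "GATATC",
    "NcoI": "CCATGG",
}


def find_sites(seq, enzymes):
    """Return {enzyme: [0-based start positions]} for each enzyme that cuts seq."""
    seq = seq.upper()
    hits = {}
    for name in enzymes:
        site = SITES[name]
        positions = [
            i for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i)
        ]
        if positions:
            hits[name] = positions
    return hits


# Mutagenic oligonucleotide, SEQ ID NO: 42 as printed in the text.
MUTAGENIC_OLIGO = "GCTTGTTCGCCATGGATATCTTCTGTATGTTC"

# ASSUMPTION: linker reconstructed from the recited site order
# (EcoRI, SalI, BglII, PstI, XhoI, BamHI, HindIII).
ASSUMED_LINKER = "GGAATTCGTCGACAGATCTCTGCAGCTCGAGGGATCCAAGCTT"
LINKER_ORDER = ["EcoRI", "SalI", "BglII", "PstI", "XhoI", "BamHI", "HindIII"]

print(find_sites(MUTAGENIC_OLIGO, ["NcoI", "EcoRV"]))
# -> {'NcoI': [9], 'EcoRV': [14]}
print(find_sites(ASSUMED_LINKER, LINKER_ORDER))
```

In the oligonucleotide the NcoI site (CCATGG) supplies the ATG start codon and abuts the EcoRV site immediately upstream of it in the promoter, the two sites sharing one G on this strand. This is consistent with the junction described above.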

A 1.7 kb promoter fragment was subcloned from pCGN1801
by partial digestion with EcoRV and ligation to pCGN786 (a
pCGN566 chloramphenicol based vector with the synthetic
linker described above in place of the normal polylinker)
cut with EcoRI and blunted by filling in with DNA
Polymerase I Klenow fragment to create pCGN1802.

A 2.1 kb SalI fragment of napin 1-2 (Fig. 10 and SEQ
ID NO: 29) containing 3' downstream sequences is subcloned
into pCGN789 (described above) and results in pCGN941.
pCGN941 is digested with XhoI and HindIII and the resulting
approximately 1.6 kb of napin 3' sequences are inserted
into XhoI-HindIII digested pCGN1802 to result in pCGN1803.
In order to remove a 326 nucleotide HindIII fragment
inserted opposite to its natural orientation, as a result
of the fact that there are 2 HindIII sites in pCGN1803, the
pCGN1803 is digested with HindIII and religated. Following
religation, a clone is selected which now contains only
1.25 kb of the original 1.6 kb napin 3' sequence. This
clone, pCGN1808, is the napin 1-2 expression cassette and
contains 1.725 kb of napin promoter sequences and 1.265 kb
of napin 3' sequence with the unique cloning sites SalI,
BglII, PstI and XhoI in between.

Napin 1-2 pCGN3223 Expression Cassette

Alternatively, pCGN1808 may be modified to contain
flanking restriction sites to allow movement of only the
expression sequences and not the antibiotic resistance
marker to binary vectors such as pCGN1557 (McBride and
Summerfelt, supra). Synthetic oligonucleotides containing
KpnI, NotI and HindIII restriction sites are annealed and
ligated at the unique HindIII site of pCGN1808, such that
only one HindIII site is recovered. The resulting plasmid,
pCGN3200, contains unique HindIII, NotI and KpnI
restriction sites at the 3'-end of the napin 3'-regulatory
sequences as confirmed by sequence analysis.

The majority of the napin expression cassette is
subcloned from pCGN3200 by digestion with HindIII and SacI
and ligation to HindIII and SacI digested pIC19R (Marsh, et
al. (1984) Gene 32:481-485) to make pCGN3212. The extreme
5'-sequences of the napin promoter region are reconstructed
by PCR using pCGN3200 as a template and two primers
flanking the SacI site and the junction of the napin 5'-
promoter and the pUC backbone of pCGN3200 from the pCGN1808
construct. The forward primer contains ClaI, HindIII,
NotI, and KpnI restriction sites as well as nucleotides
408-423 of the napin 5'-sequence (from the EcoRV site) and
the reverse primer contains the complement to napin
sequences 718-739 which include the unique SacI site in the
5'-promoter. The PCR was performed using a Perkin
Elmer/Cetus thermocycler according to manufacturer's
specifications. The PCR fragment is subcloned as a
blunt-ended fragment into pUC8 (Vieira and Messing (1982)
Gene 19:259-268) digested with HincII to give pCGN3217.
Sequence of pCGN3217 across the napin insert verifies that
no improper nucleotides were introduced by PCR. The napin
5'-sequences in pCGN3217 are ligated to the remainder of
the napin expression cassette by digestion with ClaI and
SacI and ligation to pCGN3212 digested with ClaI and SacI.
The resulting expression cassette, pCGN3221, is digested
with HindIII and the napin expression sequences are gel
purified away and ligated to pIC20H (Marsh, supra) digested
with HindIII. The final expression cassette is pCGN3223,
which contains, in an ampicillin resistant background,
essentially identical 1.725 kb napin 5' and 1.265 kb 3'
regulatory sequences as found in pCGN1808. The regulatory
regions are flanked with HindIII, NotI and KpnI restriction
sites and unique SalI, BglII, PstI, and XhoI cloning sites
are located between the 5' and 3' noncoding regions.

Desaturase sequences in sense or anti-sense 
orientation may be inserted into a napin expression 
cassette via the cloning sites. The resulting construct 
may be employed for plant transformation. For example, one 
of ordinary skill in the art could also use known 
techniques of gene cloning, mutations, insertion and repair 
to allow cloning of a napin expression cassette into any 
suitable binary vector, such as pCGN1557 (described in
Example 7) or other similar vectors.

Desaturase Expression 

The coding region of the safflower desaturase
contained in pCGN3222 is cloned into the pCGN3223 napin
cassette by digestion with XhoI and ligation to pCGN3223
digested with XhoI and SalI. The resulting plasmid,
pCGN3229, is digested with Asp718 and inserted in the
binary vector pCGN1578 (McBride and Summerfelt (1990) Plant
Mol. Biol. 14:269-276) at the unique Asp718 site. The
resulting binary vector is pCGN3231 and contains the
safflower desaturase coding sequences flanked by the napin
5' and 3' regulatory sequences as well as the plant
selectable marker construct, 35S/NPTII/tml.

The resulting binary vector, pCGN3231, is transformed
into Agrobacterium and utilized for plant transformation as
described in Example 10. For Northern analysis, total RNA
is isolated from day 21 and day 28 post-anthesis developing
seed from plants transformed with pCGN3231. Five samples
were analyzed at day 21 and two at day 28 post-anthesis.
RNA was isolated by the method of Hughes and Galau (Plant
Mol. Biol. Reporter (1988) 6:253-257). Northern blot
analysis was performed using a labeled 0.8 kb BglII
fragment of pCGN1894 as a probe. Prehybridization and
hybridization was at 42°C in 50% formamide, 10X Denhardt's
solution, 5X SSC, 0.1% SDS, 5 mM EDTA and 100 µg/ml
denatured salmon sperm DNA. Filters were washed at 55°C in
0.1X SSC, 0.1% SDS. Under these conditions, the probe does
not hybridize to the endogenous Brassica desaturase gene
sequences. mRNA complementary to the safflower desaturase
was detected in all the transgenic samples examined. More
mRNA was present at day 28 than at day 21 post-anthesis and
the highest level of RNA was seen in transgenic 3231-8.
The total safflower desaturase mRNA level was estimated to
be approximately 0.01% of the message at day 28
post-anthesis.
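
As a rough aid to reading the hybridization and wash conditions, the sketch below converts an "NX SSC" strength into molar salt concentrations. It assumes the conventional 20X SSC stock (3.0 M NaCl, 0.3 M trisodium citrate), which the text does not itself state. The function name `ssc_composition` is hypothetical.

```python
# ASSUMPTION: conventional 20X SSC = 3.0 M NaCl + 0.3 M trisodium citrate,
# so 1X SSC = 0.15 M NaCl + 0.015 M trisodium citrate.
NACL_M_PER_X = 0.15
CITRATE_M_PER_X = 0.015


def ssc_composition(fold):
    """Molar composition of SSC at the given strength (e.g. 5 for 5X, 0.1 for 0.1X)."""
    nacl = NACL_M_PER_X * fold
    citrate = CITRATE_M_PER_X * fold
    # Each trisodium citrate contributes three sodium ions.
    sodium = nacl + 3 * citrate
    return {"NaCl_M": nacl, "citrate_M": citrate, "Na_M": sodium}


for strength in (5, 0.1):
    print(strength, ssc_composition(strength))
```

On this assumption the 0.1X SSC wash carries one-fiftieth of the salt of the 5X SSC hybridization buffer, which is why the wash is the more stringent step.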

Western analysis (see below) gives a preliminary
indication of increased protein in one transformant,
3231-8. However, the Western analysis is complicated by two
factors: 1. The presence of cross-reacting material at the
same molecular weight as expected for the safflower
desaturase. We believe this material is the endogenous
Brassica desaturase. 2. The analysis of levels of protein
expressed is also complicated by the normal developmental
increase in the expression of desaturase protein during
this time period. If the seeds examined are not at the
same precise developmental stage as the control seeds,
quantitative differences in the amount of material seen may
be simply due to the normal increase in the Brassica
desaturase over this time period and not due to the
expression of the safflower desaturase.

Soluble protein is extracted from developing seeds of
Brassica by homogenization with one volume (1 ml/gram fresh
weight) of buffer containing 20 mM potassium phosphate, pH
6.8. The homogenate is clarified by centrifugation at
12,000 x g for 10 minutes. A second centrifugation is
performed if necessary to provide a non-particulate
supernatant.

Protein concentration of the extract is measured by
the micromethod of Bradford (Anal. Biochem. (1976) 72:248-
254). Proteins (20-60 µg) are separated by denaturing
electrophoresis by the method of Laemmli (supra), and are
transferred to nitrocellulose membrane by the method of
Towbin et al. (Proc. Nat. Acad. Sci. (1979) 76:4350-4354).

The nitrocellulose membrane is blocked by incubation
at room temperature for 15 minutes or at 4°C overnight in
Tris-buffered saline with Tween 20 (polyoxyethylenesorbitan
monolaurate) and milk, "TTBS-milk" (TTBS = 20 mM Tris-HCl,
500 mM NaCl, 0.1% Tween 20 (v/v), pH 7.5; "TTBS-milk" =
TTBS and 3% skim milk powder). The volume of liquid in all
incubations with the nitrocellulose membrane is sufficient
to cover the membrane completely. The membrane is then
incubated for an additional 5 minutes in TTBS.

The nitrocellulose membrane is incubated for at least 
one hour with shaking at room temperature with rabbit anti- 
stearoyl-ACP desaturase antiserum that was diluted 5,000- 
or 10,000-fold in "TTBS-milk". The rabbit anti-desaturase 
antiserum was commercially prepared from desaturase protein 
(purified as described in Example 1) by Berkeley Antibody 
Co. (Richmond, CA). The membrane is washed twice by
shaking with TTBS for 5 minutes and then with deionized H2O 
for 30 seconds. 

The nitrocellulose membrane is incubated for at least
45 minutes at room temperature in a solution of "TTBS-milk"
in which anti-rabbit IgG-alkaline phosphatase conjugate
(Promega, Madison, WI) is diluted 7,500-fold. The membrane
is washed twice in TTBS followed by deionized H2O, as
described above.

The nitrocellulose membrane is equilibrated in buffer
containing 100 mM Tris-HCl, 100 mM NaCl, 50 mM MgCl2, pH
9.5, by shaking for 5 minutes. The color reaction is
initiated by placing the nitrocellulose membrane into 50 ml
of the same buffer to which has been added 15 mg
p-nitroblue tetrazolium chloride and 7.5 mg 5-bromo-4-
chloro-3-indolyl phosphate toluidine salt (BioRad Labs;
Richmond, CA). The color reaction is stopped by rinsing the
nitrocellulose membrane with deionized H2O and drying it
between filter papers.
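
The dilutions and substrate amounts recited for the Western procedure reduce to simple arithmetic, sketched below for convenience. The 10 ml incubation volume is an assumed figure chosen only for illustration, since the text requires only that the membrane be covered. The function names `stock_volume_ul` and `mg_per_ml` are hypothetical.

```python
def stock_volume_ul(final_volume_ml, fold_dilution):
    """Microlitres of undiluted reagent giving a 1:fold dilution in final_volume_ml."""
    return final_volume_ml * 1000.0 / fold_dilution


def mg_per_ml(mass_mg, volume_ml):
    """Working concentration of a solid dissolved in the stated volume."""
    return mass_mg / volume_ml


# ASSUMPTION: 10 ml of TTBS-milk per membrane (not specified in the text).
INCUBATION_ML = 10.0

for label, fold in [
    ("primary antiserum, 5,000-fold", 5000),
    ("primary antiserum, 10,000-fold", 10000),
    ("alkaline phosphatase conjugate, 7,500-fold", 7500),
]:
    print(f"{label}: {stock_volume_ul(INCUBATION_ML, fold):.2f} ul in {INCUBATION_ML} ml")

# Substrate concentrations in the 50 ml color-development buffer.
print("nitroblue tetrazolium:", mg_per_ml(15.0, 50.0), "mg/ml")
print("5-bromo-4-chloro-3-indolyl phosphate:", mg_per_ml(7.5, 50.0), "mg/ml")
```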

Oil analysis of developing seeds indicated no
significant change in oil composition of the transformed
plants with respect to the control plants. This result is
consistent with the low levels of safflower mRNA observed
in transgenic plants as compared to levels of endogenous
Brassica desaturase (Example 12).

Example 10

In this example, an Agrobacterium-mediated plant
transformation is described. Brassica napus is
exemplified. The method is also useful for transformation
of other Brassica species including Brassica campestris.

Plant Material and Transformation 

Seeds of Brassica napus cv. Delta are soaked in 95%
ethanol for 2 min, surface sterilized in a 1.0% solution of
sodium hypochlorite containing a drop of Tween 20 for 45
min., and rinsed three times in sterile, distilled water.
Seeds are then plated in Magenta boxes with 1/10th
concentration of Murashige minimal organics medium (Gibco)
supplemented with pyridoxine (50 µg/l), nicotinic acid (50
µg/l), glycine (200 µg/l), and 0.6% Phytagar (Gibco) pH
5.8. Seeds are germinated in a culture room at 22°C in a
16 h photoperiod with cool fluorescent and red light of
intensity approximately 65 µEinsteins per square meter per
second (µEm⁻²s⁻¹).

Hypocotyls are excised from 7 day old seedlings, cut
into pieces approximately 4 mm in length, and plated on
feeder plates (Horsch et al. 1985). Feeder plates are
prepared one day before use by plating 1.0 ml of a tobacco
suspension culture onto a petri plate (100x25 mm)
containing about 30 ml MS salt base (Carolina Biological),
100 mg/l inositol, 1.3 mg/l thiamine-HCl, 200 mg KH2PO4
with 3% sucrose, 2,4-D (1.0 mg/l), 0.6% Phytagar, and pH
adjusted to 5.8 prior to autoclaving (MS 0/1/0 medium). A
sterile filter paper disc (Whatman 3 mm) is placed on top
of the feeder layer prior to use. Tobacco suspension
cultures are subcultured weekly by transfer of 10 ml of
culture into 100 ml fresh MS medium as described for the
feeder plates with 2,4-D (0.2 mg/l), Kinetin (0.1 mg/l).
All hypocotyl explants are preincubated on feeder plates
for 24 h at 22°C in continuous light of intensity 30
µEm⁻²s⁻¹ to 65 µEm⁻²s⁻¹.

Single colonies of A. tumefaciens strain EHA101
containing a binary plasmid are transferred to 5 ml MG/L
broth and grown overnight at 30°C. Per liter, MG/L broth
contains 5 g mannitol, 1 g L-glutamic acid or 1.15 g sodium
glutamate, 0.25 g KH2PO4, 0.10 g NaCl, 0.10 g MgSO4·7H2O, 1
mg biotin, 5 g tryptone, and 2.5 g yeast extract, and the
broth is adjusted to pH 7.0. Hypocotyl explants are
immersed in 7-12 ml MG/L broth with bacteria diluted to
1x10⁸ bacteria/ml and after 10-20 min. are placed onto
feeder plates. After 48 h of co-incubation with
Agrobacterium, the hypocotyl explants are transferred to B5
0/1/0 callus induction medium which contains filter
sterilized carbenicillin (500 mg/l, added after
autoclaving) and kanamycin sulfate (Boehringer Mannheim) at
concentrations of 25 mg/l.
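
For bench use, the per-liter MG/L recipe and the dilution to 1x10⁸ bacteria/ml can be worked out as in the following sketch. The overnight culture density used in the example (2x10⁹ cells/ml) is a hypothetical value, since the text does not state how dense the overnight culture is; in practice it would be measured. The names `MGL_G_PER_LITER`, `scale_recipe` and `culture_volume_ml` are likewise illustrative.

```python
# Grams per liter of MG/L broth, as listed in the text
# (L-glutamic acid variant; 1.15 g sodium glutamate may be used instead).
MGL_G_PER_LITER = {
    "mannitol": 5.0,
    "L-glutamic acid": 1.0,
    "KH2PO4": 0.25,
    "NaCl": 0.10,
    "MgSO4.7H2O": 0.10,
    "biotin": 0.001,
    "tryptone": 5.0,
    "yeast extract": 2.5,
}


def scale_recipe(volume_ml):
    """Grams of each component needed for volume_ml of MG/L broth."""
    factor = volume_ml / 1000.0
    return {name: grams * factor for name, grams in MGL_G_PER_LITER.items()}


def culture_volume_ml(stock_cells_per_ml, final_volume_ml, target_cells_per_ml=1e8):
    """Volume of overnight culture to dilute to final_volume_ml at the target density."""
    if stock_cells_per_ml < target_cells_per_ml:
        raise ValueError("culture is already below the target density")
    return final_volume_ml * target_cells_per_ml / stock_cells_per_ml


print(scale_recipe(500))
# HYPOTHETICAL overnight density of 2e9 cells/ml, 10 ml of inoculum wanted.
print(culture_volume_ml(2e9, 10.0), "ml culture made up to 10 ml with MG/L broth")
```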

After 3-7 days in culture at 65 µEm⁻²s⁻¹ to 75 µEm⁻²s⁻¹
continuous light, callus tissue is visible on the cut
surface and the hypocotyl explants are transferred to shoot
induction medium, B5BZ (B5 salts and vitamins supplemented
with 3 mg/l benzylaminopurine, 1 mg/l zeatin, 1% sucrose,
0.6% Phytagar and pH adjusted to 5.8). This medium also
contains carbenicillin (500 mg/l) and kanamycin sulfate (25
mg/l). Hypocotyl explants are subcultured onto fresh shoot
induction medium every two weeks.

Shoots regenerate from the hypocotyl calli after one
to three months. Green shoots at least 1 cm tall are
excised from the calli and placed on medium containing B5
salts and vitamins, 1% sucrose, carbenicillin (300 mg/l),
kanamycin sulfate (50 mg/l) and 0.6% Phytagar, and placed
in a culture room with conditions as described for seed
germination. After 2-4 weeks shoots which remain green are
cut at the base and transferred to Magenta boxes containing
root induction medium (B5 salts and vitamins, 1% sucrose, 2
mg/l indolebutyric acid, 50 mg/l kanamycin sulfate and 0.6%
Phytagar). Green rooted shoots are tested for NPT II
activity.

Example 11

In this example, a DNA-bombardment plant transformation is
described. Peanut transformation is exemplified.

DNA sequences of interest may be introduced as
expression cassettes, comprising at least a promoter
region, a gene of interest, and a termination region, into
a plant genome via particle bombardment as described in
European Patent Application 332 855 and in co-pending
application USSN 07/225,332, filed July 27, 1988.

Briefly, tungsten or gold particles of a size ranging
from 0.5 µm to 3 µm are coated with DNA of an expression
cassette. This DNA may be in the form of an aqueous
mixture or a dry DNA/particle precipitate.

Tissue used as the target for bombardment may be from 
cotyledonary explants, shoot meristems, immature leaflets, 
or anthers. 

The bombardment of the tissue with the DNA-coated
particles is carried out using a Biolistics™ particle gun
(Dupont; Wilmington, DE). The particles are placed in the
barrel at variable distances ranging from 1 cm to 14 cm
from the barrel mouth. The tissue to be bombarded is placed
beneath the stopping plate; testing is performed on the
tissue at distances up to 20 cm. At the moment of
discharge, the tissue is protected by a nylon net or a
combination of nylon nets with mesh ranging from 10 µm to
300 µm.

Following bombardment, plants may be regenerated
following the method of Atreya, et al. (Plant Science
Letters (1984) 34:379-383). Briefly, embryo axis tissue or
cotyledon segments are placed on MS medium (Murashige and
Skoog, Physiol. Plant. (1962) 15:473) (MS plus 2.0 mg/l
6-benzyladenine (BA) for the cotyledon segments) and
incubated in the dark for 1 week at 25 ± 2°C and are
subsequently transferred to continuous cool white
fluorescent light (6.8 W/m²). On the 10th day of culture,
the plantlets are transferred to pots containing sterile
soil, are kept in the shade for 3-5 days and are finally
moved to the greenhouse.

The putative transgenic shoots are rooted.
Integration of exogenous DNA into the plant genome may be
confirmed by various methods known to those skilled in the
art.



Example 12

This example describes methods to obtain desaturase
cDNA clones from other plant species using the DNA from the
C. tinctorius Δ-9 desaturase clone as the probe.



Isolation of RNA for Northern Analysis 

Poly(A)+ RNA is isolated from C. tinctorius embryos
collected at 14-17 days post-anthesis and Simmondsia
chinensis embryos as described in Example 5.

Total RNA is isolated from day 17-18 post-anthesis
Brassica campestris embryos by an RNA minipreparation
technique (Scherer and Knauf, Plant Mol. Biol. (1987)
9:127-134). Total RNA is isolated from R. communis immature
endosperm of about 14-21 days post-anthesis by a method
described by Halling, et al. (Nucl. Acids Res. (1985)
13:8019-8033). Total RNA is isolated from 10 g each of
young leaves from B. campestris, B. napus, and C.
tinctorius, by extraction of each sample in 5 ml/g tissue
of 4 M guanidine thiocyanate buffer as described by Colbert
et al. (Proc. Nat. Acad. Sci. (1983) 80:2248-2252). Total
RNA is also isolated from immature embryos of Cuphea
hookeriana by extraction as above in 10 ml/g tissue.

Total RNA is isolated from immature embryos of
California bay (Umbellularia californica) by an adaptation
of the method of Lagrimini et al. (Proc. Nat. Acad. Sci.
(1987) 84:7542-7546). Following homogenization in grinding
buffer (2.5 ml/g tissue) as described, RNA is precipitated
from the aqueous phase by addition of 1/10 volume 3 M
sodium acetate and 2 volumes ethanol, followed by freezing
at -80°C for 30 minutes and centrifugation at 13,000 x g
for 20 minutes. The pellets are washed with 80% ethanol
and centrifugation is repeated as above. The pellets are
resuspended in water, two volumes of 4 M LiCl are added,
and the samples are placed at -20°C overnight. Samples are
centrifuged as above and the pellets washed with 80%
ethanol. Ethanol precipitation is repeated as above.
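
The volume ratios recited for the precipitation steps can be turned into working volumes and concentrations as sketched below. The sketch assumes that "volumes" are reckoned against the original aqueous sample, which is one common reading but is not stated in the text. The function names are hypothetical.

```python
def ethanol_precipitation(sample_ml, acetate_stock_molar=3.0):
    """Volumes for a 1/10 volume sodium acetate, 2 volume ethanol precipitation.

    ASSUMPTION: both ratios are taken relative to the original sample volume.
    Returns (acetate_ml, ethanol_ml, acetate molarity before ethanol is added).
    """
    acetate_ml = sample_ml / 10.0
    ethanol_ml = 2.0 * sample_ml
    acetate_molar = acetate_stock_molar * acetate_ml / (sample_ml + acetate_ml)
    return acetate_ml, ethanol_ml, acetate_molar


def licl_final_molarity(sample_ml, stock_molar=4.0, volumes=2.0):
    """LiCl molarity after adding `volumes` sample-volumes of LiCl stock."""
    added_ml = volumes * sample_ml
    return stock_molar * added_ml / (sample_ml + added_ml)


acetate, ethanol, conc = ethanol_precipitation(1.0)
print(f"per 1 ml sample: {acetate} ml 3 M sodium acetate, {ethanol} ml ethanol")
print(f"sodium acetate before ethanol: {conc:.3f} M")
print(f"LiCl after two volumes of 4 M stock: {licl_final_molarity(1.0):.2f} M")
```

Under these assumptions the sodium acetate is about 0.27 M before ethanol is added, and the LiCl step brings the sample to about 2.7 M LiCl.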

Total RNA is further purified from B. campestris, B.
napus, and C. tinctorius leaves, and from C. tinctorius, B.
campestris, California bay, and jojoba, and from R.
communis immature endosperm, by removing polysaccharides on
a 0.25 g Sigma Cell 50 cellulose column. The RNA is loaded
onto the column in 1 ml of loading buffer (20 mM Tris-HCl
pH 7.5, 0.5 M NaCl, 1 mM EDTA, 0.1% SDS), eluted with
loading buffer, and collected in 500 µl fractions. Ethanol
is added to the samples to precipitate the RNA. The samples
are centrifuged, and the pellets resuspended in sterile
distilled water, pooled, and again precipitated in ethanol.
The sample is centrifuged, and the resulting RNA is
subjected to oligo(dT)-cellulose chromatography to enrich
for poly(A)+ RNA as described by Maniatis et al. (Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, New York (1982)). Poly(A)+ RNA is also
purified from total Cuphea hookeriana RNA by oligo(dT)-
cellulose chromatography.

Northern Analysis Using C. tinctorius Desaturase
Clone: 2.5 µg of poly(A)+ RNA from each of the above
described poly(A)+ samples from immature embryos of jojoba,
Cuphea hookeriana, California bay, Brassica campestris, and
C. tinctorius, from immature endosperm of R. communis, and
from leaves of C. tinctorius, B. campestris, and B. napus
are electrophoresed on formaldehyde/agarose gels (Fourney
et al., Focus (1988) 10:5-7) and transferred to a Hybond-C
extra (Amersham, Arlington Heights, IL) filter according to
manufacturer's specifications. The filter is prehybridized
for four hours and hybridized overnight at 42°C in a roller
bottle containing 10 ml of hybridization buffer (1 M NaCl,
1% SDS, 50% formamide, 0.1 mg/ml denatured salmon sperm
DNA) in a Hybridization Incubator, model 1040-00-1 (Robbins
Scientific Corporation, Sunnyvale, CA). The probe used in
the hybridization is a gel-isolated BglII fragment of the
Δ-9 desaturase clone that is labeled with 32P-dCTP using a
BRL (Gaithersburg, MD) nick-translation kit, following
manufacturer's instructions. The blot is washed three
times for 20 minutes each in 2X SSC, 0.5% SDS at 55°C. The
blot is exposed at -80°C, with a Dupont Cronex intensifying
screen, to X-ray film for four days.

The autoradiograph shows that the C. tinctorius
desaturase gene is expressed in both immature embryos and
leaves of C. tinctorius, although the level of expression
is considerably higher in embryos than in leaves. The
autoradiograph also shows hybridization of the C.
tinctorius desaturase clone to mRNA bands of a similar size
in immature embryos from jojoba and California bay, and
immature endosperm from R. communis. Hybridization is also
detectable in RNA from B. campestris embryos upon longer
exposure of the filter to X-ray film.

R. communis cDNA Library Construction: A plant seed
cDNA library may be constructed from poly(A)+ RNA isolated
from R. communis immature endosperm as described above.
The plasmid cloning vector pCGN1703, and cloning method are
as described in Example 5. The R. communis endosperm cDNA
bank contains approximately 2x10⁶ clones with an average
cDNA insert size of approximately 1000 base pairs.

The R. communis immature endosperm cDNA bank is moved
into the cloning vector lambda gt22 (Stratagene Cloning
Systems) by digestion of total cDNA with NotI and ligation
to lambda gt22 DNA digested with NotI. The resulting phage
are packaged using a commercially available kit and titered
using E. coli strain LE392 (Stratagene Cloning Systems, La
Jolla, CA). The titer of the resulting library was
approximately 1.5 x 10^ pfu/ml.

R. communis cDNA Library Screen: The library is
plated on E. coli strain LE392 at a density of
approximately 25,000 pfu/150 mm NZY plate to provide
approximately 50,000 plaques for screening. Phage are
lifted in duplicate on to NEN (Boston, MA) Colony/Plaque
Screen filters as described in Example 5. Following
prehybridization at 42°C in 25 ml of hybridization buffer
(1 M NaCl, 1% SDS, 50% formamide, 0.1 mg/ml denatured
salmon sperm DNA) filters are hybridized overnight with a
gel-purified 520 base pair BglII fragment of the C.
tinctorius desaturase clone (Figure 7A) that is
radiolabeled with 32P-dCTP using a BRL (Gaithersburg, MD)
Nick Translation System. Filters are washed three times
for 20 minutes each in 2X SSC, 0.5% SDS at 55°C in a
shaking water bath. Filters are exposed to X-ray film
overnight at -80°C with a Dupont Cronex intensifying
screen.

Clones are detected by hybridization on duplicate
filters with the C. tinctorius desaturase cDNA fragment and
plaque purified. During plaque purification, it was
observed that larger plaques were obtained when E. coli
strain Y1090 (Young, R.A. and Davis, R.W., Proc. Natl.
Acad. Sci. USA (1983) 80:1194) was used as the host
strain. This strain was thus used in subsequent plaque
purification steps. Phage DNA is prepared from the
purified clones as described by Grossberger (NAR (1987)
15:6737) with the following modification. The proteinase K
treatment is replaced by the addition of 10% SDS and a 10
minute incubation at room temperature. Recovered phage DNA
is digested with EcoRI, religated at low concentration, and
transformed into E. coli DH5α (BRL; Gaithersburg, MD) cells
to recover plasmids containing cDNA inserts in pCGN1703.
Minipreparation DNA (Maniatis et al., supra) is prepared
from the clones and DNA sequence is determined as described
above. Partial nucleotide sequence of the cDNA insert of a
R. communis desaturase clone pCGN3230 is presented in
Figure 3A and SEQ ID NO: 14. The complete nucleotide
sequence of this clone is presented in Fig. 3B and as SEQ
ID NO: 15.

Northern Analysis Using R. communis Desaturase Clone:
Total RNA for Northern analysis is isolated from tobacco
leaves by the method of Ursin et al. (Plant Cell (1989)
1:727-736), petunia and tomato leaves by the method of
Ecker and Davis (Proc. Nat. Acad. Sci. (1987) 84:5202-5206),
and corn leaves by the method of Turpen and Griffith
(Biotechniques (1986) 4:11-15). Total RNA samples from
tobacco, corn, and tomato leaves are enriched for poly(A)+
RNA by oligo(dT)-cellulose chromatography as described by
Maniatis et al. (supra).

Poly(A)+ RNA samples from tomato leaves (4 µg) and
corn and tobacco leaves (1 ng each), and total RNA from
petunia leaves (25 µg) are electrophoresed on a
formaldehyde/agarose gel as described by Shewmaker et al.
(Virology (1985) 140:281-288). Also electrophoresed on
this gel are poly(A)+ RNA samples isolated from B.
campestris day 17-19 embryos and B. campestris leaves (2 µg
each), immature embryos from C. tinctorius, bay, and jojoba
(1 µg each), and R. communis endosperm (1 µg). The
isolation of these poly(A)+ RNA samples is described above
for the Northern analysis using C. tinctorius desaturase
cDNA as probe. The RNA is transferred to a nitrocellulose
filter as described by Shewmaker et al. (supra) and
prehybridized and hybridized at 42°C in 50% formamide, 10X
Denhardt's solution (described in Maniatis et al. (supra)),
5X SSC, 0.1% SDS, 5 mM EDTA, 100 µg/ml denatured salmon
sperm DNA, and 10% dextran sulfate (in hybridization buffer
only). The probe for hybridization is the 32P-labeled (BRL
Nick Translation Kit) 1.7 kb SalI insert of pCGN3230 that
has been gel-purified from minipreparation DNA. The filter
is washed following hybridization for 30 minutes in 2X SSC,
0.1% SDS at 42°C and at 50°C twice for 15 minutes each.
The filter is exposed to X-ray film overnight at -80°C with
a Dupont Cronex intensifying screen.

The autoradiograph shows hybridization of the R.
communis desaturase clone to mRNA bands of a similar size in
immature embryos from B. campestris, California bay, and C.
tinctorius, and also in corn leaves and R. communis
endosperm.

B. campestris Embryo cDNA Library Construction: Total
RNA is isolated from 5 g of B. campestris cv. R500 embryos
obtained from seeds harvested at days 17-19 post-anthesis.
RNA is extracted in 25 mls of 4 M guanidine thiocyanate
buffer as described by Colbert et al. (PNAS (1983) 80:2248-
2252). Polysaccharides are removed from the RNA sample by
resuspending the pellet in 6 ml of 1X TE (10 mM Tris/1 mM
EDTA pH 8), adding potassium acetate to a concentration of
0.05 M, and adding one half volume of ethanol. The sample
is placed on ice for 60 minutes and centrifuged for 10
minutes at 3000 x g. RNA is precipitated from the
supernatant by adding sodium acetate to a concentration of
0.3 M followed by the addition of two volumes of ethanol.
RNA is recovered from the sample by centrifugation at
12,000 x g for 10 minutes and yield calculated by UV
spectrophotometry. Two mg of the total RNA is further
purified by removing polysaccharides on a 0.25 g Sigma Cell
50 cellulose column, as described above, and is also
enriched for poly(A)+ RNA by oligo(dT)-cellulose
chromatography as described above.
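
Two of the calculations implied above, namely adding a salt "to a concentration of" a stated molarity and estimating RNA yield from UV absorbance, may be sketched as follows. The 3 M stock strength and the absorbance reading are hypothetical inputs. The factor of 40 µg/ml per A260 unit is the conventional approximation for single-stranded RNA and is not taken from the text. The function names are illustrative.

```python
def stock_to_add_ml(sample_ml, stock_molar, final_molar):
    """Volume v of stock such that stock_molar * v = final_molar * (sample_ml + v)."""
    if final_molar >= stock_molar:
        raise ValueError("final concentration must be below the stock concentration")
    return final_molar * sample_ml / (stock_molar - final_molar)


def rna_yield_ug(a260, dilution_factor, volume_ml, ug_per_ml_per_a260=40.0):
    """Approximate RNA mass from A260 of a diluted aliquot (1 cm path length)."""
    return a260 * dilution_factor * ug_per_ml_per_a260 * volume_ml


# 6 ml of RNA in 1X TE, as in the text; HYPOTHETICAL 3 M acetate stocks.
print(f"potassium acetate to 0.05 M: {stock_to_add_ml(6.0, 3.0, 0.05):.3f} ml")
print(f"sodium acetate to 0.3 M:     {stock_to_add_ml(6.0, 3.0, 0.3):.3f} ml")

# HYPOTHETICAL reading: A260 = 0.5 on a 1:100 dilution of a 1 ml RNA solution.
print(f"estimated yield: {rna_yield_ug(0.5, 100, 1.0):.0f} ug")
```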

A B. campestris day 17-19 post-anthesis embryo cDNA
library is constructed in plasmid vector pCGN1703 as
described in Example 5, using 5 µg of the above described
poly(A)+ RNA. The library, which consists of approximately
1.5 x 10^ transformants, is amplified by plating and
scraping colonies, and is stored as frozen E. coli cells in
10% DMSO at -80°C. DNA is isolated from a portion of the
amplified library by scaling up the alkaline lysis
technique of Birnboim and Doly (Nucleic Acids Res. (1979)
7:1513), and purified by CsCl centrifugation. Library DNA
is digested with EcoRI and is cloned into EcoRI-digested
bacteriophage lambda gt10 (Stratagene; La Jolla, CA) DNA.
The DNA is packaged using Gigapack II Gold in vitro
packaging extracts (Stratagene; La Jolla, CA) according to
manufacturer's specifications. The titer of the phage
stock, determined by dilution plating of phage in E. coli
C600 hfl- cells (Huynh, et al., DNA Cloning. Volume I.
Eds. Glover, D.M. (1985) IRL Press Limited: Oxford,
England, pp. 56,110), is 6 x 10^ pfu per ml.

B. campestris cDNA Library Screen: The library is
plated on E. coli strain C600 hfl- at a density of
approximately 30,000 pfu/150 mm NZY plate to provide
approximately 120,000 plaques for screening. Phage are
lifted in duplicate on to NEN (Boston, MA) Colony/Plaque
Screen filters as described in Example 5. Filters are
prehybridized and hybridized with the 32P-labeled fragment
of pCGN3230 as described above for the Northern
hybridization. Filters are washed for 30 minutes in 2X
SSC, 0.1% SDS at 50°C and at 55°C twice for 15 minutes
each. Filters are exposed to X-ray film overnight at -80°C
with a Dupont Cronex intensifying screen.
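
The plating figures given for the two library screens imply a particular number of plates, which the following sketch computes. It also shows how the volume of phage stock per plate would follow from a titer. The titer used is purely hypothetical, since the exponents of the titers are not legible in the text. The function names are illustrative.

```python
import math


def plates_needed(total_plaques, pfu_per_plate):
    """Smallest whole number of plates giving at least total_plaques."""
    return math.ceil(total_plaques / pfu_per_plate)


def stock_per_plate_ul(pfu_per_plate, titer_pfu_per_ml):
    """Microlitres of phage stock delivering pfu_per_plate at the given titer."""
    return pfu_per_plate / titer_pfu_per_ml * 1000.0


print("R. communis screen:  ", plates_needed(50_000, 25_000), "plates")
print("B. campestris screen:", plates_needed(120_000, 30_000), "plates")

# HYPOTHETICAL titer of 1e6 pfu/ml, for illustration only.
print(f"{stock_per_plate_ul(30_000, 1e6):.0f} ul of stock per plate at 1e6 pfu/ml")
```

On the stated densities the R. communis screen corresponds to two 150 mm plates and the B. campestris screen to four.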

Clones are detected by hybridization on duplicate
filters to the R. communis desaturase cDNA fragment and
plaque purified. During plaque purification, the probe
used was a gel-purified 1.4 kb SstI fragment of pCGN3230
which lacks the poly(A)+ tail. As described above, phage
DNA is isolated from purified lambda clones, digested with
EcoRI, ligated, and transformed to E. coli DH5α cells.
Minipreparation DNA is prepared and partial DNA sequence
determined as described above. Partial DNA sequences of
two clones, pCGN3235 and pCGN3236, are presented in Figure
4A (SEQ ID NO: 17) and 4B (SEQ ID NO: 18), respectively.
Initial DNA sequence analysis of the 3' regions of these
clones indicates that pCGN3236 and pCGN3235 are cDNA
clones from the same gene. pCGN3236 is a shorter clone
than pCGN3235, which appears to contain the entire coding
region of the B. campestris desaturase gene. The complete
nucleotide sequence of pCGN3235 is presented in Figure 4C
and SEQ ID NO: 19.

Desaturase Gene Analysis: Southern and Northern
analyses of Brassica species are conducted to determine the
number of genes which encode the Brassica desaturase clone,
pCGN3235, in B. campestris, B. oleracea, and B. napus, and
the timing of expression of the gene in B. campestris
developing seeds. DNA is isolated from leaves of each of
the above-named Brassica species by the method of Bernatzky
and Tanksley (Theor. Appl. Genet. (1986) 72:314-321). DNA
from each of the species is digested with restriction
endonucleases EcoRI and XbaI (10 µg/digest),
electrophoresed in a 0.7% agarose gel, and transferred to a
nitrocellulose filter (Maniatis et al., supra). The filter
is prehybridized and hybridized at 42°C (as described above
for Northern analysis using R. communis desaturase clone)
with a 32P-labeled (nick translation) gel-isolated
HindIII/PvuII fragment of pCGN3235 (Fig. 7C). The filter
is washed following overnight hybridization, for 30 minutes
at 55°C in 1X SSC, 0.1% SDS, followed by two 15 minute
washes at 55°C in 0.1X SSC, 0.1% SDS.

The autoradiograph indicates that the Brassica 
desaturase is encoded by a small gene family consisting of 
about two genes in B. campestris and B. oleracea, and about 
four genes in B. napus. 

The timing of expression of the desaturase gene during
seed development is determined by Northern analysis. RNA
is isolated from immature seeds of B. campestris cv. R500
collected at 11, 13, 15, 17, 19, 21, 25, 30, 35, and 40
days post-anthesis. Total RNA is isolated as described by
Scherer and Knauf (Plant Mol. Biol. (1987) 5:127-134).
Twenty five micrograms of RNA from each time point are
electrophoresed through a formaldehyde-containing 1.5%
agarose gel as described by Shewmaker, et al. (supra) and
blotted to nitrocellulose (Thomas, Proc. Nat. Acad. Sci.
(1980) 77:5201-5205). The blot is pre-hybridized and
hybridized at 42°C with the 32P-labeled HindIII/PvuII
fragment of pCGN3235 as described above. The filter is
washed following overnight hybridization, for 30 minutes at
55°C in 1X SSC, 0.1% SDS, followed by two 15 minute washes
at 55°C in 0.1X SSC, 0.1% SDS.

The autoradiograph indicates that the desaturase gene 
is expressed in B. campestris developing seeds beginning at 
about day 19 and through about day 30, with maximal 
expression at day 25. By a similar Northern analysis, the
level of desaturase mRNA in developing Brassica napus seeds 
(day 21) was estimated to be approximately 1% of the total 
mRNA. 

Isolation of Other Desaturase Gene Sequences: cDNA
libraries may be constructed as described above and genomic
libraries can be constructed from DNA from various sources
using commercially available vectors and published DNA
isolation, fractionation, and cloning procedures. For
example, a B. campestris genomic library can be constructed
using DNA isolated according to Scofield and Crouch
(J. Biol. Chem. (1987) 262:12202-12208) that is digested with
BamHI and fractionated on sucrose gradients (Maniatis et
al., supra), and cloned into the lambda phage vector
LambdaGem-11 (Promega; Madison, WI) using cloning procedures
of Maniatis et al. (supra).

cDNA and genomic libraries can be screened for
desaturase cDNA and genomic clones, respectively, using
published hybridization techniques. Screening techniques
are described above for screening libraries with DNA
fragments. Libraries may also be screened with synthetic
oligonucleotides, for example using methods described by
Berent et al. (BioTechniques (1985) 3:208-220). Probes for
the library screening can be prepared by PCR, or from the
sequences of the desaturase clones provided herein.
Oligonucleotides prepared from the desaturase sequences may
be used, as well as longer DNA fragments, up to the entire
desaturase clone.

For example, jojoba polyadenylated RNA is used to
construct a cDNA library in the cloning vector λZAPII/EcoRI
(Stratagene, San Diego, CA). RNA is isolated from jojoba
embryos collected at 80-90 days post-anthesis by isolating
polyribosomes using a method initially described by Jackson
and Larkins (Plant Physiol. (1976) 57:5-10) and modified by
Goldberg et al. (Developmental Biol. (1981) 83:201-217).
Polysaccharide contaminants in the polyribosomal RNA
preparation are removed by running the RNA over a cellulose
column (Sigma-cell 50) in high salt buffer (0.5M NaCl, 20mM
Tris pH 7.5, 1mM EDTA, 0.1% SDS). The contaminant binds to
the column and the RNA is collected in the eluant. The
eluant fractions are pooled and the RNA is ethanol
precipitated. The precipitated total RNA is then
resuspended in a smaller volume and applied to an oligo
d(T) cellulose column to isolate the polyadenylated RNA.

The library is constructed using protocols, DNA and
bacterial strains as supplied by the manufacturer. Clones
are packaged using Gigapack Gold packaging extracts
(Stratagene), also according to manufacturer's
recommendations. The cDNA library constructed in this
manner contains approximately 1 x 10^ clones with an average
cDNA insert size of approximately 400 base pairs.

The jojoba library is plated on E. coli XL1-Blue
(Stratagene) at a density of approximately 5000 pfu/150mm
plate to provide approximately 60,000 plaques for
screening. Phage are lifted onto duplicate nylon membrane
filters as described previously. Filters are prehybridized
at 42°C in a hybridization buffer containing 40% formamide,
10X Denhardt's solution, 5X SSC, 0.1% SDS, 50mM EDTA, and
100 µg/ml denatured salmon sperm DNA. Hybridization is at
42°C in the same buffer with added nick translated (BRL
Nick Translation System) 520 bp BglII fragment of the C.
tinctorius desaturase clone described previously. Filters
are washed at 50°C in 2X SSC and exposed to X-ray film
overnight.

Desaturase clones are detected by hybridization on
duplicate filters with the C. tinctorius cDNA fragment and
plaque-purified. Positive clones are recovered as plasmids
in E. coli following manufacturer's directions and
materials for in vivo excision. Partial, preliminary DNA
sequence of a clone, 3-1, is determined and the
corresponding amino acid sequence is translated in three
frames. In this manner, homology to the C. tinctorius
desaturase cDNA clone is detected in one reading frame.
The preliminary DNA sequence of this jojoba desaturase cDNA
fragment is shown in Figure 5 (SEQ ID NO: 43). Also shown
is the corresponding translated amino acid sequence in the
reading frame having C. tinctorius desaturase homology.
The jojoba cDNA fragment is approximately 75% homologous at
the DNA level and approximately 79% homologous at the amino
acid level compared to sequence of the C. tinctorius
desaturase in this region.
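
The three-frame translation step described above can be sketched as follows. This is a minimal illustration assuming the standard genetic code; the input sequence in the usage lines is a made-up placeholder, not the jojoba sequence of SEQ ID NO: 43, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(seq: str, frame: int) -> str:
    """Translate seq starting at offset frame (0, 1 or 2).
    Codons with ambiguous bases are reported as 'X'."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frame_translation(seq: str) -> list:
    """Return the translations of the three forward reading frames."""
    return [translate(seq, frame) for frame in range(3)]


if __name__ == "__main__":
    # Hypothetical 9-base sequence used purely for illustration.
    for frame, peptide in enumerate(three_frame_translation("ATGGCCTAA")):
        print(frame, peptide)
```

Each of the three translations would then be compared with the known desaturase amino acid sequence to find the frame showing homology.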

Example 12

Antisense constructs are described which allow for
transcription of a reverse copy of the B. campestris
desaturase cDNA clone in the 5' to 3' orientation of
transcription.
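
Transcribing a reverse copy of a cDNA yields an RNA that is the reverse complement of the sense strand. A minimal sketch of that relationship, assuming plain A/C/G/T input, is given below; the function names and the six-base example sequence are hypothetical and are not taken from pCGN3235.

```python
# Translation table for complementing DNA bases (case preserved).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(COMPLEMENT)[::-1]


def antisense_rna(sense_dna: str) -> str:
    """Return the RNA produced when the cDNA is transcribed in reverse
    orientation, written 5' to 3'."""
    return reverse_complement(sense_dna).upper().replace("T", "U")


if __name__ == "__main__":
    sense = "ATGGCC"  # hypothetical fragment of a sense strand
    print(antisense_rna(sense))
```

The resulting transcript can base-pair with the endogenous sense mRNA, which is the basis of the antisense approach described in this example.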

Preferential Expression of Antisense Constructs in Embryos
In order to reduce the transcription of a desaturase
gene in embryos of B. napus or B. campestris, constructs
may be prepared which allow for production of antisense
copies of the desaturase cDNA preferentially in the
embryos. Promoter sequences which are desirable to obtain
this pattern of expression include, but are not limited to,
the ACP, Bce4, and napin 1-2 expression cassettes described
in Examples 7, 8, and 9, respectively. It also may be
desirable to control the expression of reverse copies of
the desaturase cDNA under two different promoters in the
same transformed plant to provide for a broader timing of
expression of the antisense desaturase DNA. For example,
expression from the ACP promoter may begin and end earlier
than expression from the napin promoter. Thus, expressing
the reverse desaturase from both promoters may result in
the production of the antisense strand of DNA over a longer
period of embryo development.

An example of expression of an antisense desaturase
gene preferentially in the embryos is provided below.
Similar constructs containing the same or a different
fragment of the desaturase gene and any of the promoters
described above, as well as other promoter regions which
may be useful, may also be prepared using gene cloning,
insertion, mutation and repair techniques well known to
those of ordinary skill in the art.

A. Antisense Desaturase Expression from the ACP Promoter
Construction of pCGN3239 is as follows:
pCGN3235 (Example 12) is digested with PvuII and
HindIII and the HindIII sticky ends are filled in with
Klenow in the presence of 200 dNTPs. The 1.2 kb
PvuII/HindIII fragment containing the desaturase coding
sequence is gel purified and ligated in the antisense
orientation into EcoRV-digested pCGN1977 (ACP expression
cassette; described in Example 7) to create pCGN3238. The
4.2 kb XbaI/Asp718 fragment of pCGN3238 containing the
antisense desaturase in the ACP cassette is transferred
into XbaI/Asp718-digested pCGN1557 (binary transformation
vector; described in Example 7) to create pCGN3239.

B. Antisense Desaturase Expression From The Napin
Promoter

Construction of pCGN3240 is as follows: pCGN3235 is
digested with PvuII and HindIII, the sticky ends are
blunted, and the resulting fragment is inserted in an
antisense orientation into pCGN3223 which has been digested
with SalI and blunted with Klenow enzyme. The resulting
plasmid, pCGN3240, will express an antisense desaturase RNA
from the napin promoter cassette.

C. Antisense Desaturase Expression From a Dual Promoter
Cassette

Construction of pCGN3242 is as follows: An Asp718
fragment of pCGN3240 containing the napin 5' and 3' regions
surrounding the desaturase sequences is inserted into the
Asp718 site of pCGN3239 (a binary vector containing an ACP
promoter, antisense desaturase construct) to create
pCGN3242.

Constitutive Transcription

A. Binary Vector Construction

1. Construction of pCGP291.

The KpnI, BamHI, and XbaI sites of binary vector
pCGN1559 (McBride and Summerfelt, Pl. Mol. Biol. (1990) 14:
269-276) are removed by Asp718/XbaI digestion followed by
blunting the ends and recircularization to produce pCGP67.
The 1.84 kb PstI/HindIII fragment of pCGN986 containing the
35S promoter-tml 3' cassette is inserted into PstI/HindIII
digested pCGP67 to produce pCGP291.

2. Construction of pCGN986.

The 35S promoter-tml 3' expression cassette, pCGN986,
contains a cauliflower mosaic virus 35S (CaMV35S) promoter
and a T-DNA tml 3'-region with multiple restriction sites
between them. pCGN986 is derived from another cassette,
pCGN206, containing a CaMV35S promoter and a different 3'
region, the CaMV region VI 3'-end. The CaMV 35S promoter
is cloned as an AluI fragment (bp 7144-7734) (Gardner et
al., Nucl. Acids Res. (1981) 9:2871-2888) into the HincII
site of M13mp7 (Messing, et al., Nucl. Acids Res. (1981)
9:309-321) to create C614. An EcoRI digest of C614
produced the EcoRI fragment from C614 containing the 35S
promoter which is cloned into the EcoRI site of pUC8
(Vieira and Messing, Gene (1982) 19:259) to produce
pCGN147.

pCGN148a, containing a promoter region, selectable
marker (KAN with 2 ATG's) and 3' region, is prepared by
digesting pCGN528 with BglII and inserting the BamHI-BglII
promoter fragment from pCGN147. This fragment is cloned
into the BglII site of pCGN528 so that the BglII site is
proximal to the kanamycin gene of pCGN528.

The shuttle vector used for this construct, pCGN528,
is made as follows: pCGN525 is made by digesting a plasmid
containing Tn5 which harbors a kanamycin gene (Jorgenson
et al., Mol. Gen. Genet. (1979) 177:65) with HindIII-BamHI
and inserting the HindIII-BamHI fragment containing the
kanamycin gene into the HindIII-BamHI sites in the
tetracycline gene of pACYC184 (Chang and Cohen, J.
Bacteriol. (1978) 154:1141-1156). pCGN526 was made by
inserting the BamHI fragment 19 of pTiA6 (Thomashow et
al., Cell (1980) 15:729-739), modified with XhoI linkers
inserted into the SmaI site, into the BamHI site of
pCGN525. pCGN528 is obtained by deleting the small XhoI
fragment from pCGN526 by digesting with XhoI and
religating.

pCGN149a is made by cloning the BamHI-kanamycin gene
fragment from pMB9KanXXI into the BamHI site of pCGN148a.
pMB9KanXXI is a pUC4K variant (Vieira and Messing, Gene
(1982) 19:259-268) which has the XhoI site missing, but
contains a functional kanamycin gene from Tn903 to allow
for efficient selection in Agrobacterium.

pCGN149a is digested with HindIII and BamHI and
ligated to pUC8 digested with HindIII and BamHI to produce
pCGN169. This removes the Tn903 kanamycin marker. pCGN565
and pCGN169 are both digested with HindIII and PstI and
ligated to form pCGN203, a plasmid containing the CaMV 35S
promoter and part of the 5'-end of the Tn5 kanamycin gene
(up to the PstI site, Jorgenson et al., (1979), supra). A
3'-regulatory region is added to pCGN203 from pCGN204, an
EcoRI fragment of CaMV (bp 408-6105) containing the region
VI 3' cloned into pUC18 (Yanisch-Perron, et al., Gene
(1985) 33:103-119) by digestion with HindIII and PstI and
ligation. The resulting cassette, pCGN206, is the basis
for the construction of pCGN986.

The pTiA6 T-DNA tml 3'-sequences are subcloned from
the Bam19 T-DNA fragment (Thomashow et al., (1980) supra)
as a BamHI-EcoRI fragment (nucleotides 9062 to 12,823,
numbering as in Barker et al., Plant Mol. Biol. (1982)
2:335-350) and combined with the pACYC184 (Chang and Cohen
(1978), supra) origin of replication as an EcoRI-HindIII
fragment and a gentamycin resistance marker (from plasmid
pLB41, obtained from D. Figurski) as a BamHI-HindIII
fragment to produce pCGN417.

The unique SmaI site of pCGN417 (nucleotide 11,207 of
the Bam19 fragment) is changed to a SacI site using linkers
and the BamHI-SacI fragment is subcloned into pCGN565 to
give pCGN971. The BamHI site of pCGN971 is changed to an
EcoRI site using linkers. The resulting EcoRI-SacI
fragment containing the tml 3' regulatory sequences is
joined to pCGN206 by digestion with EcoRI and SacI to give
pCGN975. The small part of the Tn5 kanamycin resistance
gene is deleted from the 3'-end of the CaMV 35S promoter by
digestion with SalI and BglII, blunting the ends and
ligation with SalI linkers. The final expression cassette
pCGN986 contains the CaMV 35S promoter followed by two SalI
sites, an XbaI site, BamHI, SmaI, KpnI and the tml 3'
region (nucleotides 11207-9023 of the T-DNA).

B. Insertion of Desaturase Sequence

The 1.6 kb XbaI fragment from pCGN3235 containing the
desaturase cDNA is inserted in the antisense orientation
into the XbaI site of pCGP291 to produce pCGN3234.

Plant Transformation

The binary vectors containing the expression cassette
and the desaturase gene are transformed into Agrobacterium
tumefaciens strain EHA101 (Hood, et al., J. Bacteriol.
(1986) 155:1291-1301) as per the method of Holsters, et
al., Mol. Gen. Genet. (1978) 153:181-187. Transformed B.
napus and/or Brassica campestris plants are obtained as
described in Example 10.

Analysis of Transgenic Plants

A. Analysis of pCGN3242 Transformed Brassica campestris
cv. Tobin Plants

Due to the self-incompatibility of Brassica campestris
cv. Tobin, individual transgenic plants are pollinated
using non-transformed Tobin pollen. Because of this, the
T2 seeds of a transgenic plant containing the antisense
desaturase at one locus would be expected to segregate in a
1:1 ratio of transformed to non-transformed seed. The oil
composition of ten individual seeds collected at 26 days
post-anthesis from several pCGN3242-transformed plants and
one non-transformed control was analyzed by gas
chromatography according to the method of Browse, et al.,
Anal. Biochem. (1986) 152:141-145. One transformant, 3242-
T-1, exhibits an oil composition that differed distinctly
from controls on preliminary analysis. The control Tobin
seeds contained an average of 1.8% 18:0 (range 1.5% - 2.0%)
and 52.9% 18:1 (range 48.2% - 57.1%). T2 seeds of 3242-T-1
segregated into two distinct classes. Five seeds contained
levels of 18:0 ranging from 1.3% to 1.9% and levels of 18:1
ranging from 42.2% to 58.3%. The other five seeds
contained from 22.9% to 26.3% 18:0 and from 19.9% to 26.1%
18:1.
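
The observed split of five high-stearate to five normal seeds can be compared with the expected 1:1 segregation using a chi-square goodness-of-fit statistic. The sketch below is illustrative only; the function name is hypothetical, and with only ten seeds the test has little power, so it serves as a consistency check rather than a demonstration of single-locus inheritance.

```python
def chi_square_one_to_one(class_a: int, class_b: int) -> float:
    """Chi-square statistic (1 degree of freedom) for an expected
    1:1 segregation between two phenotypic classes."""
    total = class_a + class_b
    if total == 0:
        raise ValueError("at least one seed must be scored")
    expected = total / 2
    return ((class_a - expected) ** 2 + (class_b - expected) ** 2) / expected


if __name__ == "__main__":
    # Counts reported in the text: 5 seeds with elevated 18:0,
    # 5 seeds with control-like 18:0.
    statistic = chi_square_one_to_one(5, 5)
    # 3.841 is the 5% critical value for 1 degree of freedom.
    print(statistic, statistic < 3.841)
```

A statistic below the 5% critical value of 3.841 means the counts do not contradict the 1:1 expectation.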

B. Analysis of pCGN3234 Transformed Plants

Some abnormalities have been observed in some
transgenic Brassica napus cv. Delta and Bingo and Brassica
campestris cv. Tobin plants containing pCGN3234. These
effects could be due to the constitutive expression of
antisense desaturase RNA from the 35S promoter or could be
due to the transformation/tissue culture regime the plants
have been subjected to.

The above results demonstrate the ability to obtain
plant Δ-9 desaturases, isolate DNA sequences which encode
desaturase activity and manipulate them. In this way,
transcription cassettes, including expression cassettes,
can be produced which allow for production, including
specially differentiated cell production, of the
desired product. A purified C. tinctorius desaturase is
provided and used to obtain nucleic acid sequences of C.
tinctorius desaturase. Other plant desaturase sequences
are provided such as R. communis, B. campestris, and S.
chinensis. These sequences as well as desaturase sequences
obtained from them may be used to obtain additional
desaturase, and so on. And, as described in the
application, modification of oil composition may be
achieved.

All publications and patent applications mentioned in
this specification are indicative of the level of skill of
those skilled in the art to which this invention pertains.
All publications and patent applications are herein
incorporated by reference to the same extent as if each
individual publication or patent application was
specifically and individually indicated to be incorporated
by reference.

Although the foregoing invention has been described in
some detail by way of illustration and example for purposes
of clarity of understanding, it will be obvious that
certain changes and modifications may be practiced within
the scope of the appended claim.

What is claimed is:

1. A recombinant DNA construct comprising a sequence
encoding at least a portion of a plant desaturase, said
desaturase when mature having activity toward an
unsaturated fatty acid substrate.

2. The construct of Claim 1 encoding a biologically 
active plant desaturase. 

3. The construct of Claim 1 wherein said sequence
encodes a precursor desaturase.

4. The construct of Claim 1 wherein said sequence
encodes a mature desaturase.

5. The construct of Claim 1 wherein said sequence 
encodes a transit peptide. 

6. The construct of Claim 1 comprising a cDNA
sequence.

7. The construct of Claim 1 wherein said sequence is
joined to a second nucleic acid sequence which is not
naturally joined to said first sequence.

8. The construct of Claim 1 comprising, in the 5' to
3' direction of transcription, a transcriptional regulatory
region functional in a host cell and said sequence.

9. The construct of Claim 8 further comprising a
translational regulatory region immediately 5' to said
sequence and a transcriptional/translational termination
regulatory region 3' to said sequence, wherein said
regulatory regions are functional in said host cell.

10. The construct of Claim 8 wherein said sequence is 
a sense sequence. 

11. The construct of Claim 8 wherein said sequence is
an anti-sense sequence.

12. The construct of Claim 8 wherein said host cell
is a plant cell.

13. The construct of Claim 12 wherein said
transcriptional initiation region is obtained from a gene
preferentially expressed in plant seed tissue during lipid
accumulation.

14. The construct of Claim 13 wherein said
transcriptional initiation region is selected from the
regulatory region 5' upstream to a structural gene of the
group consisting of any one of Bce4, seed ACP Bcg 4-4 and
napin 1-2.

15. The construct of Claim 9 wherein said
transcriptional termination region is a plant desaturase
termination region.

16. The construct of Claim 1 wherein said plant
desaturase is a Δ-9 desaturase.

17. The construct of Claim 1 wherein said sequence is
obtainable from any one of C. tinctorius, R. communis and
B. campestris.

18. A method of modifying fatty acid composition in a
plant host cell from a given percentage of fatty acid
saturation to a different percentage of fatty acid
saturation comprising
growing a host plant cell having integrated into its
genome a recombinant DNA sequence encoding a plant
desaturase under the control of regulatory elements
functional in said plant cell during lipid accumulation
under conditions which will promote the activity of said
regulatory elements.

19. The method of Claim 18 wherein the overexpression
of plant desaturase is obtained.

20. The method of Claim 18 wherein the decrease of
endogenous plant desaturase is obtained.

21. The method of Claim 18 wherein said regulatory
elements function preferentially in plant seed.

22. The method of Claim 20 wherein the percentage of
long chain unsaturated fatty acids is increased.

23. A plant cell having a modified level of saturated
fatty acids produced according to the method of any one of
Claims 18-22.

24. The plant cell of Claim 23 wherein said cell is a
Brassica plant cell.

25. The plant cell of Claim 23 wherein said cell is
in vivo.

26. The plant cell of Claim 23 wherein said cell is
an oilseed embryo plant cell.

27. A plant seed having a modified level of saturated
fatty acids as compared to a seed of said plant having a
native level of saturated fatty acids produced according to
a method comprising
growing a plant, having integrated into the genome of
embryo cells a recombinant DNA sequence encoding a plant
desaturase under the control of regulatory elements
functional in seed during lipid accumulation, to produce
seed under conditions which will promote the activity of
said regulatory elements, and
harvesting said seed.

28. The seed of Claim 27 wherein said plant is
Brassica napus.

29. The seed of Claim 27 wherein said seed is an
oilseed.

30. The seed of Claim 27 wherein said plant
desaturase is a Δ-9 desaturase.

31. A plant seed oil of a plant having an endogenous
level of saturated fatty acids comprising a plant seed oil
having a modified level of saturated fatty acids.

32. The oil of Claim 31 comprising a Brassica napus
oil.

33. A plant seed oil separated from a seed produced
according to any one of Claims 27-30.

34. A host cell comprising a plant desaturase
encoding sequence of any one of Claims 1-17.

35. The cell of Claim 34 wherein said cell is a plant
cell.

36. The cell of Claim 35 wherein said plant cell is
in vivo.

37. The cell of Claim 35 wherein said plant cell is a
Brassica plant cell.

38. A transgenic host cell comprising an expressed
plant desaturase.

39. The cell of Claim 38 wherein said host cell is a
plant cell.

40. The cell of Claim 38 wherein said plant
desaturase is a Δ-9 desaturase.

41. A method of producing a plant desaturase in a
host cell or progeny thereof comprising
growing a host cell or progeny thereof comprising a
construct of any one of Claims 1-10 and 12-17 under
conditions which will permit the production of said plant
desaturase.

42. The method of Claim 41 wherein said host cell is
a plant cell and said construct is integrated into the
genome of said plant cell.

43. The method of Claim 42 wherein said plant cell is
in vivo.

44. A host cell comprising a plant desaturase
produced according to Claim 41.

45. The cell of Claim 45 wherein said host cell is a
plant host cell and said construct is integrated into the
genome of said plant cell.

TITLE

FATTY ACID DESATURASE GENES FROM PLANTS

FIELD OF THE INVENTION

The invention relates to the preparation and use of
nucleic acid fragments encoding fatty acid desaturase
enzymes to modify plant lipid composition.

BACKGROUND OF THE INVENTION

Plant lipids have a variety of industrial and
nutritional uses and are central to plant membrane
function and climatic adaptation. These lipids
represent a vast array of chemical structures, and these
structures determine the physiological and industrial
properties of the lipid. Many of these structures
result either directly or indirectly from metabolic
processes that alter the degree of unsaturation of the
lipid. Different metabolic regimes in different plants
produce these altered lipids, and either domestication
of exotic plant species or modification of agronomically
adapted species is usually required to economically
produce large amounts of the desired lipid.

Plant lipids find their major use as edible oils in
the form of triacylglycerols. The specific performance
and health attributes of edible oils are determined
largely by their fatty acid composition. Most vegetable
oils derived from commercial plant varieties are
composed primarily of palmitic (16:0), stearic (18:0),
oleic (18:1), linoleic (18:2) and linolenic (18:3)
acids. Palmitic and stearic acids are, respectively,
16- and 18-carbon-long, saturated fatty acids. Oleic,
linoleic, and linolenic acids are 18-carbon-long,
unsaturated fatty acids containing one, two, and three
double bonds, respectively. Oleic acid is referred to
as a mono-unsaturated fatty acid, while linoleic and
linolenic acids are referred to as poly-unsaturated
fatty acids. The relative amounts of saturated and
unsaturated fatty acids in commonly used, edible
vegetable oils are summarized below (Table 1):



TABLE 1
Percentages of Saturated and Unsaturated Fatty
Acids in the Oils of Selected Oil Crops

Crop                      Saturated   Mono-unsaturated   Poly-unsaturated
(crop name not legible)       6%            58%                36%
Soybean                      15%            24%                61%
Corn                         13%            25%                62%
Peanut                       18%            48%                34%
Safflower                     9%            13%                78%
(crop name not legible)       9%            41%                51%
Cotton                       30%            19%                51%

Many recent research efforts have examined the role
that saturated and unsaturated fatty acids play in
reducing the risk of coronary heart disease. In the
past, it was believed that mono-unsaturates, in contrast
to saturates and poly-unsaturates, had no effect on
serum cholesterol and coronary heart disease risk.
Several recent human clinical studies suggest that diets
high in mono-unsaturated fat and low in saturated fat
may reduce the "bad" (low-density lipoprotein)
cholesterol while maintaining the "good" (high-density
lipoprotein) cholesterol (Mattson et al., Journal of
Lipid Research (1985) 26:194-202).

A vegetable oil low in total saturates and high in
mono-unsaturates would provide significant health
benefits to consumers as well as economic benefits to
oil processors. As an example, canola oil is considered
a very healthy oil. However, in use, the high level of
poly-unsaturated fatty acids in canola oil renders the
oil unstable, easily oxidized, and susceptible to
development of disagreeable odors and flavors
(Gailliard, 1980, Vol. 4, pp. 85-116 In: Stumpf, P. K.,
Ed., The Biochemistry of Plants, Academic Press, New
York). The levels of poly-unsaturates may be reduced by
hydrogenation, but the expense of this process and the
concomitant production of nutritionally questionable
trans isomers of the remaining unsaturated fatty acids
reduces the overall desirability of the hydrogenated oil
(Mensink et al., New England J. Medicine (1990) 323:
439-445). Similar problems exist with soybean and corn
oils.

For specialized uses, high levels of poly-
unsaturates can be desirable. Linoleate and linolenate
are essential fatty acids in human diets, and an edible
oil high in these fatty acids can be used for
nutritional supplements, for example in baby foods.
Linseed oil, derived from the flax plant (Linum
usitatissimum), contains over 50% linolenic acid and has
widespread use in domestic and industrial coatings since
the double bonds of the fatty acids react rapidly with
oxygen to polymerize into a soft and flexible film.
Although the oil content of flax is comparable to canola
(around 40% dry weight of seed), high yields are only
obtained in warm temperatures or subtropical climates.
In the USA flax is highly susceptible to rust infection.
It would be commercially useful if a crop such as soybean
or canola could be genetically transformed by the
appropriate desaturase gene(s) to synthesize oils with a
high linolenic acid content.

Mutation-breeding programs have met with some
success in altering the levels of polyunsaturated fatty
acids found in the edible oils of agronomic
species. Examples of commercially grown varieties are
high (85%) oleic sunflower and low (2%) linolenic flax
(Knowles, (1980) pp. 35-38 In: Applewhite, T. H., Ed.,
World Conference on Biotechnology for the Fats and Oils
Industry Proceedings, American Oil Chemists' Society).
Similar commercial progress with the other plants shown
in Table 1 has been largely elusive due to the difficult
nature of the procedure and the pleiotropic effects of
the mutational regime on plant hardiness and yield
potential.

WO 93/11245                                PCT/US92/10284

The biosynthesis of the major plant lipids has been
the focus of much research (Browse et al., Ann. Rev.
Plant Physiol. Mol. Biol. (1991) 42:467-506). These
studies show that, with the notable exception of the
soluble stearoyl-acyl carrier protein desaturase, the
controlling steps in the production of unsaturated fatty
acids are largely catalyzed by membrane-associated fatty
acid desaturases. Desaturation reactions occur in
plastids and in the endoplasmic reticulum using a
variety of substrates including galactolipids,
sulfolipids, and phospholipids. Genetic and
physiological analyses of Arabidopsis thaliana nuclear
mutants defective in various fatty acid desaturation
reactions indicate that most of these reactions are
catalyzed by enzymes encoded at single genetic loci in
the plant. The analyses show further that the different
defects in fatty acid desaturation can have profound and
different effects on the ultra-structural morphology,
cold sensitivity, and photosynthetic capacity of the
plants (Ohlrogge, et al., Biochim. Biophys. Acta (1991)
1082:1-26). However, biochemical characterization of
the desaturase reactions has been meager. The
instability of the enzymes and the intractability of
their proper assay has largely limited researchers to
investigations of enzyme activities in crude membrane
preparations. These investigations have, however,
demonstrated the role of delta-12 desaturase and
delta-15 desaturase activities in the production of
linoleate and linolenate from 2-oleoyl-phosphatidyl-
choline and 2-linoleoyl-phosphatidylcholine,
respectively (Wang et al., Plant Physiol. Biochem.
(1988) 26:777-792). Thus, modification of the
activities of these enzymes represents an attractive
target for altering the levels of lipid unsaturation by
genetic engineering.

Genes from plants for stearoyl-acyl carrier protein
desaturase, the only soluble fatty acid desaturase
known, have been described (Thompson, et al., Proc.
Natl. Acad. Sci. U.S.A. (1991) 88:2578-2582; Shanklin et
al., Proc. Natl. Acad. Sci. USA (1991) 88:2510-2514).
Stearoyl-coenzyme-A desaturase genes from yeast, rat,
and mice have also been described (Stukey, et al., J.
Biol. Chem. (1990) 265:20144-20149; Thiede, et al., J.
Biol. Chem. (1986) 261:13230-13235; Kaestner, et al., J.
Biol. Chem. (1989) 264:14755-14761). No evidence exists
in the public art that describes the isolation of fatty
acid desaturases other than stearoyl-ACP desaturases
from higher plants or their corresponding genes. A
fatty acid desaturase gene from the cyanobacterium,
Synechocystis PCC 6803, has also been described (Wada,
et al., Nature (1990) 347:200-203). This gene encodes a
fatty acid desaturase, designated desA, that catalyzes
the conversion of oleic acid at the 1 position of
galactolipids to linoleic acid. However, these genes
have not proven useful for isolating plant fatty acid
desaturases other than stearoyl-ACP desaturase via
sequence-dependent protocols, and the present art does
not indicate how to obtain plant fatty acid desaturases
other than stearoyl-ACP desaturases or how to obtain
fatty acid desaturase-related enzymes. Thus, the
present art does not teach how to obtain glycerolipid
desaturases from plants. Furthermore, there is no
evidence that a method to control the nature and levels
of unsaturated fatty acids in plants using nucleic acids
encoding fatty acid desaturases other than stearoyl-ACP
desaturase is known in the art.

The biosynthesis of the minor plant lipids has been
less well studied. While hundreds of different fatty
acids have been found, many from the plant kingdom, only
a tiny fraction of all plants have been surveyed for
their lipid content (Gunstone, et al., Eds., (1986) The
Lipids Handbook, Chapman and Hall Ltd., Cambridge).
Accordingly, little is known about the biosynthesis of
these unusual fatty acids and fatty acid derivatives.
Interesting chemical features found in such fatty acids
include, for example, allenic and conjugated double
bonds, acetylenic bonds, trans double bonds, multiple
double bonds, and single double bonds in a wide number
of positions and configurations along the fatty acid
chain. Similarly, many of the structural modifications
found in unusual lipids (e.g., hydroxylation,
epoxidation, cyclization, etc.) are probably produced
via further metabolism following chemical activation of
the fatty acid by desaturation or they involve a
chemical reaction that is mechanistically similar to
desaturation. For example, evidence for the mechanism
of hydroxylation of fatty acids being part of a general
mechanism of enzyme-catalyzed desaturation in eukaryotes
has been obtained by substituting a sulfur atom in the
place of carbon at the delta-9 position of stearic acid.
When incubated with yeast cell extracts the thiostearate
was converted to a 9-sulfoxide (Buist et al. (1987)
Tetrahedron Letters 28:857-860). This sulfoxidation was
specific for sulfur at the delta-9 position and did not
occur in a yeast delta-9-desaturase deficient mutant
(Buist & Marecak (1991) Tetrahedron Letters 32:891-894).
The 9-sulfoxide is the sulfur analogue of
9-hydroxyoctadecastearate, the proposed intermediate of stearate
desaturation. Thus fatty-acid desaturase cDNAs may
serve as useful probes for cDNAs encoding fatty-acid
hydroxylases and other cDNAs which encode enzymes with
reaction mechanisms similar to fatty-acid desaturation.
Many of these fatty acids and derivatives having such
features within their structure could prove commercially
useful if an agronomically viable species could be
induced to synthesize them by introduction of a gene
encoding the appropriate desaturase.

SUMMARY OF THE INVENTION

Applicants have discovered a means to control the
nature and levels of unsaturated fatty acids in plants.
Nucleic acid fragments from glycerolipid desaturase
cDNAs or genes are used to create chimeric genes. The
chimeric genes may be used to transform various plants
to modify the fatty acid composition of the plant or the
oil produced by the plant. More specifically, one
embodiment of the invention is an isolated nucleic acid
fragment comprising a nucleotide sequence encoding a
plant delta-15 fatty acid desaturase or a fatty acid
desaturase-related enzyme with an amino acid identity of
50%, 65%, 90% or greater to the polypeptide encoded by
SEQ ID NOS:1, 4, 6, 8, 10, 12, 14, or 16. The isolated
fragment in these embodiments is isolated from a plant
selected from the group consisting of soybean, oilseed
Brassica species, Arabidopsis thaliana and corn.
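
The identity thresholds recited above (50%, 65%, 90%) presuppose
a way of computing percent identity between two sequences. The
following is a minimal illustrative sketch only, assuming a global
alignment in the style of Needleman and Wunsch with arbitrary
scores (match +1, mismatch -1, gap -1) and assuming that identity
is counted over the full alignment length. The function names, the
scoring values and the identity convention are assumptions made
for illustration and are not taken from this disclosure.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two sequences; return the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of alignment length."""
    aln_a, aln_b = needleman_wunsch(a, b)
    matches = sum(x == y and x != "-" for x, y in zip(aln_a, aln_b))
    return 100.0 * matches / len(aln_a)


print(percent_identity("ACGT", "AGT"))
```

Other conventions (for example, dividing by the length of the
shorter sequence, or using a substitution matrix for peptides)
give different percentages, so any threshold comparison should
state which convention was used.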

Another embodiment of this invention involves the
use of these nucleic acid fragments in sequence-dependent
protocols. Examples include use of the
fragments as hybridization probes to isolate other
glycerolipid desaturase cDNAs or genes. A related
embodiment involves using the disclosed sequences for
amplification of DNA fragments encoding other
glycerolipid desaturases.

Another aspect of this invention involves chimeric
genes capable of causing altered levels of linolenic
acid in a transformed plant cell, the gene comprising
nucleic acid fragments encoding a plant
delta-15 fatty acid desaturase or a fatty acid
desaturase-related enzyme with an amino acid identity of
50%, 65%, 90% or greater to the polypeptide encoded by
SEQ ID NOS:1, 4, 6, 8, 10, 12, 14, or 16 operably linked
in suitable orientation to suitable regulatory
sequences. Preferred are those chimeric genes which
incorporate nucleic acid fragments encoding delta-15
fatty acid desaturase cDNAs or genes. Plants and oil
from seeds of plants containing the chimeric genes
described are also claimed.

Yet another embodiment of the invention involves a
method of producing seed oil containing altered levels
of linolenic (18:3) acid comprising: (a) transforming a
plant cell with a chimeric gene described above; (b)
growing fertile plants from the transformed plant cells
of step (a); (c) screening progeny seeds from the
fertile plants of step (b) for the desired levels of
linolenic (18:3) acid; and (d) processing the progeny
seed of step (c) to obtain seed oil containing altered
levels of the unsaturated fatty acids. Preferred plant
cells and oils are derived from soybean, rapeseed,
sunflower, cotton, cocoa, peanut, safflower, coconut,
flax, oil palm, and corn. Preferred methods of
transforming such plant cells would include the use of
Ti and Ri plasmids of Agrobacterium, electroporation,
and high-velocity ballistic bombardment.

The invention also is embodied in a method of
breeding plant species to obtain altered levels of
polyunsaturated fatty acids, specifically linolenic (18:3)
acid in seed oil of oil-producing plants. This method
involves (a) making a cross between two varieties of an
oilseed plant differing in the linolenic acid trait; (b)
making a Southern blot of restriction enzyme digested
genomic DNA isolated from several progeny plants
resulting from the cross of step (a); and (c)
hybridizing the Southern blot with the radiolabeled
nucleic acid fragments encoding the claimed glycerolipid
desaturases.

The invention is also embodied in a method of RFLP
mapping that uses the isolated Arabidopsis thaliana
delta-15 desaturase sequences described herein.

The invention is also embodied in plants capable of
producing altered levels of glycerolipid desaturase by
virtue of containing the chimeric genes described
herein. Further, the invention is embodied by seed oil
obtained from such plants.

The invention is also embodied in a method of RFLP
mapping in a genomic RFLP marker comprising (a) making a
cross between two varieties of plants; (b) making a
Southern blot of restriction enzyme digested genomic DNA
isolated from several progeny plants resulting from the
cross of step (a); and (c) hybridizing the Southern blot
with a radiolabelled nucleic acid fragment of the
claimed fragments.

The invention is also embodied in a method to
isolate nucleic acid fragments encoding fatty acid
desaturases and fatty acid desaturase-related enzymes,
comprising (a) comparing SEQ ID NOS:2, 5, 7, 9, 11, 13,
15 and 17 with other fatty acid desaturase polypeptide
sequences; (b) identifying the conserved sequence(s) of
4 or more amino acids obtained in step (a); (c) making
region-specific nucleotide probe(s) or oligomer(s) based
on the conserved sequences identified in step (b); and (d)
using the nucleotide probe(s) or oligomer(s) of step (c)
to isolate sequences encoding fatty acid desaturases and
fatty-acid desaturase-related enzymes by sequence-dependent
protocols. The product of the
isolation method described is also part of the
invention.
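
Step (b) of the method above, identifying conserved stretches of
4 or more amino acids, can be sketched as follows. This is an
illustrative sketch only: it assumes the peptide sequences have
already been aligned to equal length (gaps written as "-"), it
requires strict identity in every sequence, and the three toy
peptides are invented for the example and are not taken from any
SEQ ID NO of this application. The function name conserved_runs
is likewise hypothetical.

```python
def conserved_runs(aligned, min_len=4):
    """Return (1-based start, residues) for each run of columns that is
    identical in every aligned sequence and at least min_len long."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to equal length")
    runs, current, start = [], [], 0
    for i, column in enumerate(zip(*aligned)):
        if len(set(column)) == 1 and column[0] != "-":
            if not current:
                start = i
            current.append(column[0])
        else:
            if len(current) >= min_len:
                runs.append((start + 1, "".join(current)))
            current = []
    if len(current) >= min_len:
        runs.append((start + 1, "".join(current)))
    return runs


# Invented toy peptides, for illustration only.
toy = [
    "MKTHDCGHGSFAALW",
    "MRSHDCGHGSFSPLW",
    "LKTHDCGHGSFTELW",
]
print(conserved_runs(toy))  # [(4, 'HDCGHGSF')]
```

A conserved run found this way would then be back-translated into
a degenerate oligomer for step (c); relaxing strict identity to
conservative substitutions is a possible variation.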
BRIEF DESCRIPTION OF THE SEQUENCE DESCRIPTIONS
The invention can be more fully understood from the
following detailed description and the Sequence
Descriptions which form a part of this application. The
Sequence Descriptions contain the one letter code for
nucleotide sequence characters and the three letter code
for amino acids in conformity with the IUPAC-IUB
standard described in Nucleic Acids Research
13:3021-3030 (1985) and 37 C.F.R. 1.822 which are
incorporated herein by reference.

SEQ ID NO:1 shows the complete 5' to 3' nucleotide
sequence of 1350 base pairs of the Arabidopsis cDNA
which encodes delta-15 desaturase in plasmid pCF3.
Nucleotides 46 to 48 are the putative initiation codon
of the open reading frame (nucleotides 46 to 1206).
Nucleotides 1204 to 1206 are the termination codon.
Nucleotides 1 to 45 and 1207 to 1350 are the 5' and 3'
untranslated nucleotides, respectively. The 386 amino
acid protein sequence in SEQ ID NO:1 is that deduced
from the open reading frame.
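
The coordinates quoted for SEQ ID NO:1 are mutually consistent,
which can be checked with simple arithmetic. The sketch below is
illustrative only; the helper name orf_summary is hypothetical,
and it assumes 1-based inclusive coordinates with the termination
codon counted as part of the open reading frame, as in the
description above.

```python
def orf_summary(start, stop_end, total_length):
    """Derive codon and UTR coordinates from a 1-based, inclusive ORF
    that runs from the initiation codon through the termination codon."""
    orf_nt = stop_end - start + 1
    if orf_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return {
        "five_prime_utr": (1, start - 1) if start > 1 else None,
        "initiation_codon": (start, start + 2),
        "termination_codon": (stop_end - 2, stop_end),
        "three_prime_utr": (stop_end + 1, total_length)
        if stop_end < total_length else None,
        # The termination codon does not encode an amino acid.
        "amino_acids": orf_nt // 3 - 1,
    }


summary = orf_summary(46, 1206, 1350)
print(summary["amino_acids"])      # 386
print(summary["three_prime_utr"])  # (1207, 1350)
```

The same check can be applied to the other cDNA descriptions that
give both an open reading frame and a peptide length.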

SEQ ID NO:2 is the deduced peptide of the open
reading frame of SEQ ID NO:1.

SEQ ID NO:3 is a partial nucleotide sequence of the
Arabidopsis genomic DNA insert in plasmid pF1 which
shows the genomic sequence in the region of the
Arabidopsis genome that encodes delta-15 desaturase.
Nucleotides 68-255 are identical to nucleotides 1-188 of
SEQ ID NO:1. Nucleotides 47 to 49 and 56 to 58 are
termination codons in the same reading frame as the open
reading frame in SEQ ID NO:1.

SEQ ID NO:4 shows the 5' to 3' nucleotide sequence
of the insert in plasmid pACF2-2 of 1525 base pairs of
the Arabidopsis thaliana cDNA that encodes a plastid
delta-15 fatty acid desaturase. Nucleotides 10-12 and
nucleotides 1348 to 1350 are, respectively, the putative
initiation codon and the termination codon of the open
reading frame (nucleotides 10 to 1350). Nucleotides 1 to
9 and 1351 to 1525 are, respectively, the 5' and 3'
untranslated nucleotides.

SEQ ID NO:5 is the deduced peptide of the open
reading frame of SEQ ID NO:4.

SEQ ID NO:6 shows the complete 5' to 3' nucleotide
sequence of 1336 base pairs of the Brassica napus seed
cDNA, found in plasmid pBNSF3-2, which encodes a
microsomal delta-15 glycerolipid desaturase.
Nucleotides 79 to 81 are the putative initiation codon
of the open reading frame (nucleotides 79 to 1212).
Nucleotides 1210 to 1212 are the termination codon.
Nucleotides 1 to 78 and 1213 to 1336 are the 5' and 3'
untranslated nucleotides respectively.

SEQ ID NO: 7 is the deduced peptide of the open 
reading frame of SEQ ID NO: 6. 

SEQ ID NO:8 is the complete 5' to 3' nucleotide
sequence of 1416 base pairs of the Brassica napus seed
cDNA found in plasmid pBNSFd-2 which encodes a plastid
delta-15 glycerolipid desaturase. Nucleotides 1 to 1215
correspond to a continuous open reading frame of 404
amino acids. Nucleotides 1213 to 1215 are the
termination codon. Nucleotides 1216 to 1416 are the 3'
untranslated nucleotides.

SEQ ID NO: 9 is the deduced peptide of the open 
reading frame of SEQ ID NO: 8. 

SEQ ID NO:10 is the complete nucleotide sequence of
the soybean (Glycine max) microsomal delta-15 desaturase
cDNA, found in plasmid pXF1. The 2184 nucleotides
of this sequence contain both the coding sequence and
the 5' and 3' non-translated regions of the cDNA.
Nucleotides 855 to 857 are the putative initiation codon
of the open reading frame (nucleotides 855 to 2000).
Nucleotides 1995 to 1997 are the termination codon.
Nucleotides 1 to 854 and 1998 to 2184 are the 5' and 3'
untranslated nucleotides respectively. The 380 amino
acid protein sequence in SEQ ID NO:7 is that deduced
from the open reading frame.

SEQ ID NO:11 is the deduced peptide of the open
reading frame in SEQ ID NO:10.

SEQ ID NO:12 is the complete 5' to 3' nucleotide
sequence of 1676 base pairs of the soybean (Glycine max)
seed cDNA found in plasmid pSFD-118bwp which encodes a
soybean plastid delta-15 desaturase. Nucleotides 169 to
1530 correspond to a continuous open reading frame of
453 amino acids. Nucleotides 169 to 171 are the
putative initiation codon of the open reading frame.
Nucleotides 1528 to 1530 are the termination codon.
Nucleotides 1531 to 1676 are the 3' untranslated
nucleotides. Nucleotides 169 to 382 encode the putative
plastid transit peptide, based on comparison of the
deduced peptide with the soybean microsomal delta-15
peptide.

SEQ ID NO:13 is the deduced peptide of the open
reading frame in SEQ ID NO:12.

SEQ ID NO:14 is the complete nucleotide sequence of
a 396 bp polymerase chain reaction product derived from
corn seed mRNA that is found in the insert of plasmid
pPCR20. Nucleotides 1 to 31 and 364 to 396 correspond
to the amplification primers described in SEQ ID NO:18
and SEQ ID NO:19, respectively. Nucleotides 31 to 363
encode an internal region of a corn seed delta-15
desaturase that is 61.9% identical to the region between
amino acids 137 and 249 of the Brassica napus delta-15
desaturase peptide sequence shown in SEQ ID NO:7.

SEQ ID NO:15 is the deduced amino acid sequence of
SEQ ID NO:14.

SEQ ID NO:16 shows the partial composite 5' to 3'
nucleotide sequence of 472 bp derived from the inserts
in plasmids pFadx-2 and pYacp7 for Arabidopsis thaliana
cDNA that encodes a plastid delta-15 fatty acid
desaturase. Nucleotides 2-4 and nucleotides 468 to 470
are, respectively, the first and the last codons in the
open reading frame.

SEQ ID NO:17 is the deduced partial peptide sequence of
the open reading frame in SEQ ID NO:16.

SEQ ID NO:18 One hundred and twenty eight-fold
degenerate sense 31-mer PCR primer. Nucleotides 1 to 8
correspond to the Bam HI restriction enzyme recognition
sequence. Nucleotides 9 to 31 correspond to amino acid
residues 130 to 137 of SEQ ID NO:6 with a deoxyinosine
base at nucleotide 11.
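
The "fold degeneracy" of a primer is the number of distinct
sequences in the mixture, i.e. the product of the number of bases
allowed at each position. The sketch below is illustrative only.
It assumes IUPAC ambiguity codes and treats deoxyinosine (written
"I") as a single, non-degenerate position. The example primer is
invented to have the same general shape as the primer described
above (8 nucleotides carrying a Bam HI site, 31 nucleotides in
all, inosine at nucleotide 11, 128-fold degenerate); it is NOT the
sequence of SEQ ID NO:18.

```python
from math import prod

# Number of bases represented by each IUPAC nucleotide code.
IUPAC_CHOICES = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
    "I": 1,  # deoxyinosine: one base that pairs with several partners
}


def degeneracy(primer):
    """Return how many distinct oligonucleotides the primer mix contains."""
    return prod(IUPAC_CHOICES[base] for base in primer.upper())


# Hypothetical primer: 8-nt tail containing GGATCC plus 23 degenerate nt.
example = "CGGGATCC" + "ACITAYCAYCARAAYCAYGGRYA"
print(len(example), degeneracy(example))  # 31 128
```

Replacing a four-fold position with inosine, as in the primers
described here, is one common way of keeping the degeneracy of
the mixture down.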

SEQ ID NO:19 Two thousand and forty eight-fold
degenerate antisense 35-mer PCR primer. Nucleotides 1
to 8 correspond to the Bam HI restriction enzyme
recognition sequence. Nucleotides 9 to 35 correspond to
amino acid residues 249 to 256 of SEQ ID NO:6 with a
deoxyinosine base at nucleotide 15.

SEQ ID NO:20 Sixteen-fold degenerate sense 36-mers
made to amino acid residues 97-108 in SEQ ID NO:2.

SEQ ID NO:21 Sixteen-fold degenerate sense 36-mers
made to amino acid residues 97-108 in SEQ ID NO:2.

SEQ ID NO:22 Seventy two-fold degenerate sense
18-mers made to amino acid residues 100-105 in SEQ ID
NO:2.

SEQ ID NO:23 Seventy two-fold degenerate sense
18-mers made to amino acid residues 100-105 in SEQ ID
NO:2.

SEQ ID NO:24 Seventy two-fold degenerate antisense
18-mers made to amino acid residues 299-304 in SEQ ID
NO:2.

SEQ ID NO:25 Seventy two-fold degenerate antisense
18-mers made to amino acid residues 299-304 in SEQ ID
NO:2.

SEQ ID NO:26 Seventy two-fold degenerate antisense
18-mers made to amino acid residues 304-309 in SEQ ID
NO:2.

SEQ ID NO:27 Seventy two-fold degenerate antisense
18-mers made to amino acid residues 304-309 in SEQ ID
NO:2.

SEQ ID NO:28 Sixteen-fold degenerate sense 36-mers
made to amino acid residues 97-108 in SEQ ID NO:2.

SEQ ID NO:29 Sixteen-fold degenerate sense 36-mers
made to amino acid residues 97-108 in SEQ ID NO:2.

SEQ ID NO:30 Sixty four-fold degenerate antisense
38-mers made to amino acid residues 299-311 in SEQ ID
NO:2.

SEQ ID NO:31 Sixty four-fold degenerate antisense
38-mers made to amino acid residues 299-311 in SEQ ID
NO:2.

SEQ ID NO:32 A 135-mer made as an antisense strand
to amino acid residues 97-141 in SEQ ID NO:2.

DETAILED DESCRIPTION OF THE INVENTION

Applicants have isolated nucleic acid fragments
that encode plant fatty acid desaturases and that are
useful in modifying fatty acid composition in
oil-producing species by transformation.

Thus, transfer of the nucleic acid fragments of the
invention or a part thereof that encodes a functional
enzyme, along with suitable regulatory sequences that
direct the transcription of their mRNA, into a living
cell will result in the production or over-production of
plant fatty acid desaturases and will result in
increased levels of unsaturated fatty acids in cellular
lipids, including triacylglycerols.

Transfer of the nucleic acid fragments of the
invention or a part thereof, along with suitable
regulatory sequences that direct the transcription of
their antisense RNA, into plants will result in the
inhibition of expression of the endogenous fatty acid
desaturase that is substantially homologous with the
transferred nucleic acid fragment and will result in
decreased levels of unsaturated fatty acids in cellular
lipids, including triacylglycerols.

Transfer of the nucleic acid fragments of the
invention or a part thereof, along with suitable
regulatory sequences that direct the transcription of
their mRNA, into plants may result in inhibition by
cosuppression of the expression of the endogenous fatty
acid desaturase gene that is substantially homologous
with the transferred nucleic acid fragment and may
result in decreased levels of unsaturated fatty acids in
cellular lipids, including triacylglycerols.

The nucleic acid fragments of the invention can
also be used as restriction fragment length polymorphism
(RFLP) markers in Arabidopsis genetic mapping and plant
breeding programs.

The nucleic acid fragments of the invention or
oligomers derived therefrom can also be used to isolate
other related glycerolipid desaturase genes using DNA,
RNA, or a library of cloned nucleotide sequences from
the same or different species by well known
sequence-dependent protocols, including, for example, methods of
nucleic acid hybridization and amplification by the
polymerase chain reaction.

Definitions

In the context of this disclosure, a number of
terms shall be used. The term "fatty acid desaturase"
used herein refers to an enzyme which catalyzes the
breakage of a carbon-hydrogen bond and the introduction
of a carbon-carbon double bond into a fatty acid
molecule. The fatty acid may be free or esterified to
another molecule including, but not limited to,
acyl-carrier protein, coenzyme A, sterols and the glycerol
moiety of glycerolipids. The term "glycerolipid
desaturases" used herein refers to a subset of the fatty
acid desaturases that act on fatty acyl moieties
esterified to a glycerol backbone. "Delta-12
desaturase" refers to a fatty acid desaturase that
catalyzes the formation of a double bond between carbon
positions 6 and 7 (numbered from the methyl end), (i.e.,
those that correspond to carbon positions 12 and 13
(numbered from the carbonyl carbon) of an 18 carbon-long
fatty acyl chain or carbon positions 10 and 11 (numbered
from the carbonyl carbon) of a 16 carbon-long fatty acyl
chain). "Delta-15 desaturase" refers to a fatty acid
desaturase that catalyzes the formation of a double bond
between carbon positions 3 and 4 (numbered from the
methyl end), (i.e., those that correspond to carbon
positions 15 and 16 (numbered from the carbonyl carbon)
of an 18 carbon-long fatty acyl chain and carbon
positions 13 and 14 (numbered from the carbonyl carbon)
of a 16 carbon-long fatty acyl chain). Examples of
fatty acid desaturases include, but are not limited to,
the microsomal delta-12 and delta-15 desaturases that
act on phosphatidylcholine lipid substrates; the
chloroplastic delta-12 and delta-15 desaturases that act
on phosphatidyl glycerol and galactolipids; and other
desaturases that act on such fatty acid substrates such
as phospholipids, galactolipids, and sulfolipids.
"Microsomal desaturase" refers to the cytoplasmic
location of the enzyme, while "chloroplast desaturase"
and "plastid desaturase" refer to the plastid location
of the enzyme. These fatty acid desaturases may be
found in a variety of organisms including, but not
limited to, higher plants, diatoms, and various
eukaryotic and prokaryotic microorganisms such as fungi
and photosynthetic bacteria and algae. The term
"homologous fatty acid desaturases" refers to fatty acid
desaturases that catalyze the same desaturation on the
same lipid substrate. Thus, microsomal delta-15
desaturases, even from different plant species, are
homologous fatty acid desaturases. The term
"heterologous fatty acid desaturases" refers to fatty
acid desaturases that catalyze desaturations at
different positions and/or on different lipid
substrates. Thus, for example, microsomal delta-12 and
delta-15 desaturases, which act on phosphatidylcholine
lipids, are heterologous fatty acid desaturases, even
when from the same plant. Similarly, microsomal
delta-15 desaturase, which acts on phosphatidylcholine
lipids, and chloroplast delta-15 desaturase, which acts
on galactolipids, are heterologous fatty acid
desaturases, even when from the same plant. It should
be noted that these fatty acid desaturases have never
been isolated and characterized as proteins.
Accordingly the terms such as "delta-12 desaturase" and
"delta-15 desaturase" are used as a convenience to
describe the proteins encoded by nucleic acid fragments
that have been isolated based on the phenotypic effects
caused by their disruption. The term "fatty acid
desaturase-related enzyme" refers to enzymes whose
catalytic product may not be a carbon-carbon double bond
but whose mechanism of action is similar to that of a
fatty acid desaturase (that is, catalysis of the
displacement of a carbon-hydrogen bond of a fatty acid
chain to form a fatty-hydroxyacyl intermediate or
end-product). This term is different from "related fatty
acid desaturases", which refers to structural
similarities between fatty acid desaturases.
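
The two numbering schemes used in the definitions of delta-12
and delta-15 desaturase are related by simple arithmetic: a
carbon at position p counted from the methyl end of an n-carbon
chain is at position n + 1 - p counted from the carbonyl carbon.
The sketch below is illustrative only; the function name is
hypothetical.

```python
def carbonyl_end_positions(methyl_pos_a, methyl_pos_b, chain_length):
    """Convert the two carbons of a double bond from methyl-end numbering
    to numbering from the carbonyl carbon (returned in ascending order)."""
    a = chain_length + 1 - methyl_pos_a
    b = chain_length + 1 - methyl_pos_b
    return tuple(sorted((a, b)))


# Delta-12 desaturase: methyl-end positions 6 and 7.
print(carbonyl_end_positions(6, 7, 18))  # (12, 13)
print(carbonyl_end_positions(6, 7, 16))  # (10, 11)

# Delta-15 desaturase: methyl-end positions 3 and 4.
print(carbonyl_end_positions(3, 4, 18))  # (15, 16)
print(carbonyl_end_positions(3, 4, 16))  # (13, 14)
```

This reproduces the four pairs of positions given in the
definitions for 18-carbon and 16-carbon fatty acyl chains.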

The term "nucleic acid" refers to a large molecule
which can be single-stranded or double-stranded,
composed of monomers (nucleotides) containing a sugar, a
phosphate and either a purine or pyrimidine. A "nucleic
acid fragment" is a fraction of a given nucleic acid
molecule. In higher plants, deoxyribonucleic acid (DNA)
is the genetic material while ribonucleic acid (RNA) is
involved in the transfer of the information in DNA into
proteins. A "genome" is the entire body of genetic
material contained in each cell of an organism. The
term "nucleotide sequence" refers to the sequence of DNA
or RNA polymers, which can be single- or
double-stranded, optionally containing synthetic, non-natural
or altered nucleotide bases capable of incorporation
into DNA or RNA polymers. The term "oligomer" refers to
short nucleotide sequences, usually up to 150 bases
long. "Region-specific nucleotide probes" refers to
isolated nucleic acid fragments derived from a cDNA or
gene using a knowledge of the amino acid regions
conserved between different fatty-acid desaturases which
may be used to isolate cDNAs or genes for other
fatty-acid desaturases or fatty acid desaturase-related
enzymes using sequence dependent protocols. As used
herein, the term "homologous to" refers to the
relatedness between the nucleotide sequence of two
nucleic acid molecules or between the amino acid
sequences of two protein molecules. Estimates of such
homology are provided by either DNA-DNA or DNA-RNA
hybridization under conditions of stringency as is well
understood by those skilled in the art (Hames and
Higgins, Eds. (1985) Nucleic Acid Hybridisation, IRL
Press, Oxford, U.K.); or by the comparison of sequence
similarity between two nucleic acids or proteins, such
as by the method of Needleman et al. (J. Mol. Biol.
(1970) 48:443-453). As used herein, "substantially
homologous" refers to nucleotide sequences that have
more than 90% overall identity at the nucleotide level
with the coding region of the claimed sequence, such as
genes and pseudo-genes corresponding to the coding
regions. The nucleic acid fragments described herein
include molecules which comprise possible variations,
both man-made and natural, such as but not limited to
(a) those that involve base changes that do not cause a
change in an encoded amino acid, or (b) which involve
base changes that alter an amino acid but do not affect
the functional properties of the protein encoded by the
DNA sequence, (c) those derived from deletions,
rearrangements, amplifications, random or controlled
mutagenesis of the nucleic acid fragment, and (d) even
occasional nucleotide sequencing errors.

"Gene" refers to a nucleic acid fragment that
expresses a specific protein, including regulatory
sequences preceding (5' non-coding) and following (3'
non-coding) the coding region. "Fatty acid desaturase
gene" refers to a nucleic acid fragment that expresses a
protein with fatty acid desaturase activity. "Native"
gene refers to an isolated gene with its own regulatory
sequences as found in nature. "Chimeric gene" refers to
a gene that comprises heterogeneous regulatory and
coding sequences not found in nature. "Endogenous" gene
refers to the native gene normally found in its natural
location in the genome and is not isolated. A "foreign"
gene refers to a gene not normally found in the host
organism but that is instead introduced by gene
transfer. "Pseudo-gene" refers to a genomic nucleotide
sequence that does not encode a functional enzyme.

"Coding sequence" refers to a DNA sequence that
codes for a specific protein and excludes the non-coding
sequences. It may constitute an "uninterrupted coding
sequence", i.e., lacking an intron or it may include one
or more introns bounded by appropriate splice junctions.
An "intron" is a nucleotide sequence that is transcribed
in the primary transcript but that is removed through
cleavage and re-ligation of the RNA within the cell to
create the mature mRNA that can be translated into a
protein.

"Initiation codon" and "termination codon" refer to
a unit of three adjacent nucleotides in a coding
sequence that specifies initiation and chain termination
respectively, of protein synthesis (mRNA translation).
"Open reading frame" refers to the coding sequence
uninterrupted by introns between initiation and
termination codons that encodes an amino acid sequence.
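
The definitions of initiation codon, termination codon and open
reading frame can be illustrated with a short search routine.
This is a minimal sketch under stated assumptions: it works on a
cDNA-style DNA string, takes ATG as the initiation codon and TAA,
TAG and TGA as termination codons (the standard genetic code),
and simply reports the first ATG that is followed by an in-frame
termination codon. The function name and the test sequence are
invented for illustration; identifying a "putative" initiation
codon in practice would normally weigh further evidence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_orf(dna):
    """Return 1-based inclusive (start, end) of the first ATG-initiated
    reading frame that reaches an in-frame stop codon, or None."""
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Step through the sequence codon by codon in the ATG's frame.
        for pos in range(start, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                return start + 1, pos + 3
        start = dna.find("ATG", start + 1)
    return None


# Invented example: ATG at nucleotides 3-5, TAG at nucleotides 12-14.
print(first_orf("CCATGAAATTTTAGGG"))  # (3, 14)
```

Note that the out-of-frame TGA overlapping the initiation codon
in the example is ignored, because only codons in the same frame
as the ATG are examined.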

"RNA transcript" refers to the product resulting
from RNA polymerase-catalyzed transcription of a DNA
sequence. When the RNA transcript is a perfect
complementary copy of the DNA sequence, it is referred
to as the primary transcript or it may be a RNA sequence
derived from posttranscriptional processing of the
primary transcript and is referred to as the mature RNA.
"Messenger RNA (mRNA)" refers to the RNA that is without
introns and that can be translated into protein by the
cell. "cDNA" refers to a double-stranded DNA that is
complementary to and derived from mRNA. "Sense" RNA
refers to RNA transcript that includes the mRNA.
"Antisense RNA" refers to a RNA transcript that is
complementary to all or part of a target primary
transcript or mRNA and that blocks the expression of a
target gene by interfering with the processing,
transport and/or translation of its primary transcript
or mRNA. The complementarity of an antisense RNA may be
with any part of the specific gene transcript, i.e., at
the 5' non-coding sequence, 3' non-coding sequence,
introns, or the coding sequence. In addition, as used
herein, antisense RNA may contain regions of ribozyme
sequences that increase the efficacy of antisense RNA to
block gene expression. "Ribozyme" refers to a catalytic
RNA and includes sequence-specific endoribonucleases.
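
The complementarity that defines an antisense RNA can be made
concrete with a reverse-complement calculation. The sketch below
is illustrative only: it assumes plain Watson-Crick pairing
(A-U, G-C), ignores wobble pairing and secondary structure, and
uses invented function names and an invented toy transcript.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(sense_rna):
    """Return the reverse complement of an RNA, written 5' to 3'."""
    return sense_rna.upper().translate(COMPLEMENT)[::-1]


def targets(antisense, transcript):
    """True if the antisense RNA is complementary to all or part of
    the transcript (exact Watson-Crick pairing assumed)."""
    return antisense_rna(antisense) in transcript.upper()


toy_mrna = "GGAUGGCCAAAUAG"          # invented transcript
anti = antisense_rna("AUGGCC")       # antisense to part of it
print(anti)                          # GGCCAU
print(targets(anti, toy_mrna))       # True
```

Because the definition allows complementarity to any part of the
transcript, the check is a substring test against the whole
transcript rather than against the coding sequence alone.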
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As used herein, "suitable regulatory sequences" 
refer to nucleotide sequences in native or chimeric 
genes that are located upstream (5')f within, and/or 
downstream (3*) to the nucleic acid fragments of the 
5 invention, which control the expression of the nucleic 
acid fragments of the invention. The term "expression", 
as used herein, refers to the transcription and stable 
accumulation of the sense (mRNA) or the antlsense RNA 
derived from the nucleic acid fragment (s) of the 

10 invention that, in conjunction with the protein 

apparatus of the cell, results in altered levels of the 
fatty acid desaturase <s) • Expression or overexpression 
of the gene involves transcription of the gene and 
translation of the mRNA into precursor or mature fatty 

15 acid desaturase proteins. "Antlsense inhibition" refers 
to the production of antlsense RNA transcripts capable 
of preventing the expression of the target protein. 
"Overexpression" refers to the production of a gene 
product in transgenic organisms that exceeds levels of 

20 production in normal or non-transformed organisms. 

"Cosuppression" refers to the expression of a foreign 
gene which has substantial homology to an endogenous 
gene resulting in the suppression of expression of both 
the foreign and the endogenous gene. "Altered levels" 

25 refers to the production of gene product (s) in 

transgenic organisms in amounts or proportions that 
differ from that of normal or non-transformed organisms. 

"Promoter" refers to a DNA sequence in a gene, 
usually upstream (5') to its coding sequence, which
controls the expression of the coding sequence by
providing the recognition for RNA polymerase and other
factors required for proper transcription. In
artificial DNA constructs promoters can also be used to
transcribe antisense RNA. Promoters may also contain
DNA sequences that are involved in the binding of
protein factors which control the effectiveness of
transcription initiation in response to physiological or
developmental conditions. It may also contain enhancer
elements. An "enhancer" is a DNA sequence which can
stimulate promoter activity. It may be an innate
element of the promoter or a heterologous element
inserted to enhance the level and/or tissue-specificity
of a promoter. "Constitutive promoters" refers to those
that direct gene expression in all tissues and at all
times. "Tissue-specific" or "development-specific"
promoters as referred to herein are those that direct
gene expression almost exclusively in specific tissues,
such as leaves or seeds, or at specific development
stages in a tissue, such as in early or late
embryogenesis, respectively.

The "3' non-coding sequences" refers to the DNA
sequence portion of a gene that contains a
polyadenylation signal and any other regulatory signal
capable of affecting mRNA processing or gene expression.
The polyadenylation signal is usually characterized by
affecting the addition of polyadenylic acid tracts to
the 3' end of the mRNA precursor.

The term "Transit Peptide" refers to the N-terminal
extension of a protein that serves as a signal for
uptake and transport of that protein into an organelle
such as a plastid or mitochondrion.

"Transformation" herein refers to the transfer of a 
foreign gene into the genome of a host organism and its
genetically stable inheritance. "Restriction fragment
length polymorphism" refers to different sized
restriction fragment lengths due to altered nucleotide
sequences in or around variant forms of genes.
"Fertile" refers to plants that are able to propagate
sexually.

"Oil-producing species" herein refers to plant
species which produce and store triacylglycerol in
specific organs, primarily in seeds. Such species
include soybean (Glycine max), rapeseed and canola
(including Brassica napus, B. campestris), sunflower
(Helianthus annuus), cotton (Gossypium hirsutum), corn
(Zea mays), cocoa (Theobroma cacao), safflower
(Carthamus tinctorius), oil palm (Elaeis guineensis),
coconut palm (Cocos nucifera), flax (Linum
usitatissimum), castor (Ricinus communis) and peanut
(Arachis hypogaea). The group also includes
non-agronomic species which are useful in developing
appropriate expression vectors such as tobacco, rapid
cycling Brassica species, and Arabidopsis thaliana, and
wild species which may be a source of unique fatty
acids.

"Sequence-dependent protocols" refer to techniques 
that rely on a nucleotide sequence for their utility. 
Examples of sequence-dependent protocols include, but
are not limited to, the methods of nucleic acid and
oligomer hybridization and methods of DNA and RNA
amplification such as are exemplified in various uses of
the polymerase chain reaction. "PCR product" refers to
the DNA product obtained through polymerase chain
reaction.

Various solutions used in the experimental 
manipulations are referred to by their common names such
as "SSC", "SSPE", "Denhardt's solution", etc. The
composition of these solutions may be found by reference
to Appendix B of Sambrook et al. (Molecular Cloning, A
Laboratory Manual, 2nd ed. (1989), Cold Spring Harbor
Laboratory Press).

T-DNA Mutagenesis and Identification of an
Arabidopsis Mutant Defective in Delta-15 Desaturation
In T-DNA mutagenesis (Feldmann et al., Science
(1989) 243:1351-1354), the integration of T-DNA in the
genome can interrupt normal expression of the gene at or
near the site of the integration. If the resultant
mutant phenotype can be detected and shown genetically
to be tightly linked to the T-DNA insertion, then the
"tagged" locus and its wild type counterpart can be
readily isolated by molecular cloning by one skilled in
the art.

Arabidopsis thaliana seeds were transformed by
Agrobacterium tumefaciens C58C1rif strain harboring the
avirulent Ti-plasmid pGV3850::pAK1003 that has the T-DNA
region between the left and right T-DNA borders replaced
by the origin of replication region and ampicillin
resistance gene of plasmid pBR322, a bacterial kanamycin
resistance gene, and a plant kanamycin resistance gene
(Feldmann et al., Mol. Gen. Genetics (1987) 208:1-9).
Plants from the treated seeds were self-fertilized and
the resultant progeny seeds, germinated in the presence
of kanamycin, were self-fertilized to give rise to a
population, designated T3, that was segregating for
T-DNA insertions. T3 seeds from approximately 6000 T2
plants were analyzed for fatty acid composition. One
line, designated 3707, showed a reduced level of
linolenic acid (18:3). One more round of
self-fertilization of mutant line 3707 produced T4 progeny
seeds. The ratio of 18:2/18:3 in seeds of the
homozygous mutant in the T4 population was ca. 14; this
ratio is ca. 1.8 and ca. 23, respectively, in wild-type
Arabidopsis and the Arabidopsis fad 3 mutant [Lemieux et al.
(1990) Theor. App. Gen. 80:234-240] obtained via
chemical mutagenesis. These seeds were planted and 263
individual plants were analyzed for the presence of
nopaline in leaf extracts. T5 seeds from these plants
were further analyzed for fatty acid composition and the
ability to germinate in the presence of kanamycin. The
mutant fatty acid phenotype was found to segregate in a
1:2:1 ratio, as was germinability on kanamycin.
Nopaline was found in all plants with an altered fatty
acid phenotype, but not in wild type segregants. These
results provided evidence that the locus controlling
delta-15 desaturation was interrupted by T-DNA in mutant
line 3707.
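
A 1:2:1 segregation of the kind reported above is commonly checked with a chi-square goodness-of-fit test. The sketch below illustrates that calculation only. The class counts are hypothetical (the text gives the total of 263 plants but not the per-class numbers), and the function name is an illustrative choice.

```python
import math


def chi_square_1_2_1(homozygous_mutant, heterozygous, wild_type):
    """Goodness-of-fit test of three observed class counts against 1:2:1."""
    observed = [homozygous_mutant, heterozygous, wild_type]
    total = sum(observed)
    expected = [total * 0.25, total * 0.50, total * 0.25]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Three classes leave 2 degrees of freedom; for 2 d.f. the chi-square
    # survival function has the closed form exp(-x / 2).
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value


# Hypothetical counts summing to 263 (illustration only, not source data).
stat, p = chi_square_1_2_1(64, 135, 64)
print(f"chi-square = {stat:.3f}, p = {p:.3f}")
```

A large p-value means the counts do not contradict a 1:2:1 ratio. It does not by itself prove single-locus inheritance.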

Isolation of Arabidopsis Genomic DNA 
Containing the Gene Controlling Delta-15 Desaturation
In order to isolate the gene controlling delta-15
desaturation from wild-type Arabidopsis, a T-DNA-plant
DNA "junction" fragment containing a T-DNA border
integrated into the host plant DNA was isolated from
Arabidopsis mutant 3707. For this, genomic DNA from the
mutant plant was isolated and completely digested by
either Bam HI or Sal I restriction enzymes. In each
case, one of the resultant fragments was expected to
contain the origin of replication and ampicillin-
resistance gene of pBR322 as well as the left T-DNA-
plant DNA junction fragment. Such fragments were
rescued as plasmids by ligating the digested genomic DNA
fragments at a dilute concentration to facilitate self-
ligation and then using the ligated fragments to
transform E. coli cells. Ampicillin-resistant E. coli
transformants were isolated and screened by colony
hybridization to fragments containing either the left or
the right T-DNA border. Of the 192 colonies obtained
from the plasmid rescue of Sal I digested genomic DNA,
31 hybridized with the left T-DNA border fragment, 4
hybridized to the right T-DNA border fragment, and none
hybridized to both. Of the 85 colonies obtained from
the plasmid rescue of Bam HI digested genomic DNA, 63
hybridized to the left border and none to the right
border. Restriction analysis of seven rescued plasmids
that were obtained from the Bam HI digestion and that
hybridized to the left T-DNA border showed that they
were indistinguishable and contained 1.4 kb of putative,
flanking plant DNA. Restriction analysis of another
rescued plasmid, pS1, that was obtained from the Sal I
digestion and hybridized only to the left T-DNA border,
showed that it contained 2.9 kb of putative, flanking
plant DNA. This flanking DNA had a Bam HI site and a
Hind III site 1.4 kb and 2.2 kb, respectively, away from
the left T-DNA border, suggesting that the 1.4 kb
putative plant DNA in Bam HI rescued plasmids was
contained within the 2.9 kb putative plant DNA in the
Sal I rescued plasmids. Southern blot analysis of wild
type and mutant 3707 Arabidopsis genomic DNA using the
radiolabeled 1.4 kb DNA fragment as the hybridization
probe confirmed that this fragment contained plant DNA
and that the T-DNA integration site was in a 2.8 kb
Bam HI, a 5.2 kb Hind III, a 3.5 kb Sal I, a 5.5 kb
Eco RI, and an approximately 9 kb Cla I fragment of wild
type Arabidopsis DNA. Nucleotide sequencing of plasmid
pS1 with a primer made to a left T-DNA border sequence
revealed that pS1 was colinear with the sequence of the
left T-DNA border (Yadav et al., Proc. Natl. Acad. Sci.
USA (1982) 79:6322-6326) up to nucleotide position 65,
which is in the T-DNA border repeats. Approximately 800
bp of additional sequence in pS1 beyond the T-DNA-plant
DNA junction, that is, in the plant DNA adjoining the
left T-DNA border, showed no significant homology to the
T-DNA of pGV3850::pAK1003 and no significant open
reading frame.

The nucleic acid fragment from wild-type 
Arabidopsis corresponding to the plant DNA flanking
T-DNA in the line 3707 was isolated by screening a
lambda phage Arabidopsis thaliana genomic library with
the 1.4 kb plant DNA isolated from the rescued plasmids
as a hybridization probe. Seven positively-hybridizing
genomic clones were isolated that fell in one of five
classes based on partial restriction mapping. While
their average insert size was approximately 15 kb, taken
together they spanned a total of approximately 40 kb of
genomic DNA. A combination of restriction and Southern
analyses revealed that the five clones overlapped the
site of integration of the left border of the T-DNA and
that there was no detectable rearrangement of plant DNA
in the rescued plasmids as compared to that in the wild
type genomic plant DNA. One of these lambda phage
clones, designated 41A1, was representative of the
recovered clones and contained an approximately 20 kb
genomic DNA insert which was more or less symmetrically
arranged around the site of insertion of the left border
of the T-DNA. This clone was deposited on November 27,
1991 with the American Type Culture Collection of
Rockville, Maryland under the provisions of the Budapest
Treaty and bears accession number ATCC 75167.

Isolation of Arabidopsis Delta-15
Desaturase cDNA
A 5.2 kb Hind III fragment containing wild-type
genomic DNA, which hybridized to the 1.4 kb flanking
plant DNA recovered from line 3707 and which was
interrupted near its middle by the T-DNA insertion in
line 3707, was isolated from lambda phage clone 41A1 and
cloned into the Hind III site of the pBluescript SK
vector (Stratagene) by standard cloning procedures
described in Sambrook et al. (Molecular Cloning, A
Laboratory Manual, 2nd ed. (1989), Cold Spring Harbor
Laboratory Press). The resultant plasmid was designated
pF1. The isolated 5.2 kb Hind III fragment was also
used as a radiolabeled hybridization probe to screen a
cDNA library made to poly A+ mRNA from 3-day-old
etiolated Arabidopsis thaliana (ecotype Columbia)
seedling hypocotyls in a lambda ZAP II vector
(Stratagene). Of the several positively-hybridizing
plaques, four strongly-hybridizing ones were subjected
to plaque purification. Sequences of the pBluescript
(Stratagene) vector, including the cDNA inserts, from
each of the purified phage stocks were excised in the
presence of a helper phage. The resultant phagemids
were used to infect E. coli cells which yielded double-
stranded plasmids, pCF1, pCF2, pCF3, and pCF4. All four
were shown to contain at least one approximately 1.3 to
1.4 kb Not I insert fragment (Not I/Eco RI adaptors were
used in the preparation of the cDNA library) which
hybridized to the same region of wild-type plant genomic
DNA present in the isolated phage clones. This region,
which was near the site of integration of the left T-DNA
border in line 3707, was on the side of the T-DNA
insertion opposite to that of the plant DNA flanking the
left T-DNA border isolated previously via plasmid
rescue. Partial sequence determination of the different
cDNAs revealed common identity. Since multiple versions
of only one type of cDNA were obtained from a cDNA
library made from etiolated tissue which is expected to
express delta-15 desaturation, and since these cDNAs
hybridized to the genomic DNA that corresponds to the
site of T-DNA integration in line 3707 which had a high
linoleic acid/low linolenic acid phenotype, Applicants
were led to conclude that the T-DNA in line 3707
interrupted the normal expression of the gene encoding
delta-15 desaturase. The complete nucleotide sequence
of one cDNA, designated pCF3, was determined and is
shown as SEQ ID NO:1. It reveals an open reading frame
that encodes a 386 amino acid polypeptide. One of the
sequencing primers made to the pCF3 insert was also used
to obtain 255 bp of sequence from pF1 that is shown as
SEQ ID NO:3. Nucleotides 68 to 255 of the genomic DNA
in pF1 (SEQ ID NO:3) are identical to nucleotides 1 to
188 of the cDNA (SEQ ID NO:1), which shows that they are
colinear and that the cDNA is encoded by the gene in
the isolated genomic DNA. Nucleotides 113 to 115 in SEQ
ID NO:3 are the initiation codon of the largest open
reading frame corresponding to nucleotides 46-48 in SEQ
ID NO:1. This is evident from the presence of in-frame
termination codons at nucleotides 47 to 49 and
nucleotides 56 to 58 and the absence of observable
intron splice junctions in SEQ ID NO:3. The
identification of the 386 amino acid polypeptide as a
desaturase was confirmed by comparing its amino acid
sequence with all the protein sequences found in Release
19.0 of the SWISSPROTEIN database using the FASTA
algorithm of Pearson and Lipman (Proc. Natl. Acad. Sci.
USA (1988) 85:2444-2448) and the BLAST program (Altschul
et al., J. Mol. Biol. (1990) 215:403-410). The most
homologous protein found in both searches was the desA
fatty acid desaturase from the cyanobacterium
Synechocystis PCC6803 (Wada et al., Nature (1990)
347:200-203; GenBank ID:CSDESA; GenBank Accession
No:X53508). The 386 amino acid peptide in SEQ ID NO:1
was also compared to the 351 amino acid sequence of desA
by the method of Needleman et al. (J. Mol. Biol. (1970)
48:443-453). Over their entire length, these proteins
were 26% identical, the comparison imposing four major
gaps in the desA protein sequence. While this overall
homology is poor, homology in shorter stretches was
better. For instance, in a stretch of 78 amino acids
the Arabidopsis delta-15 desaturase (amino acids 78 to
155 in SEQ ID NO:1) and the desA protein (amino acids 67
to 144) showed 40% identity and 66% similarity.
Homology in yet shorter stretches was even greater as
shown in Table 2.

TABLE 2

Peptide    AA positions      AA positions    Percent
Length     in SEQ ID NO:1    in desA         Identity

12         97-108            86-97           83
7          115-121           104-110         71
9          133-141           122-130         56
11         299-309           282-292         64

These high percent identities in short stretches of
amino acids between the cyanobacterial desaturase
polypeptide and SEQ ID NO:2 suggest significant
relatedness between the two.
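
The entries of Table 2 can be checked for internal consistency: each pair of position ranges should span the stated peptide length, and each percent identity should correspond to a whole number of identical residues. The following is a minimal sketch of that check. It assumes the desA span in the third row reads 122-130, the only reading that gives a 9-residue window, and the function names are illustrative.

```python
# Rows of Table 2: (peptide length, span in SEQ ID NO:1, span in desA, percent identity).
# The desA span in the third row is assumed to be 122-130.
TABLE_2 = [
    (12, (97, 108), (86, 97), 83),
    (7, (115, 121), (104, 110), 71),
    (9, (133, 141), (122, 130), 56),
    (11, (299, 309), (282, 292), 64),
]


def span_length(span):
    """Number of residues in an inclusive (start, end) span."""
    start, end = span
    return end - start + 1


def implied_identical_residues(length, percent):
    """Whole number of matches whose rounded percentage equals the reported one."""
    matches = round(length * percent / 100)
    assert round(100 * matches / length) == percent
    return matches


for length, query_span, desa_span, percent in TABLE_2:
    assert span_length(query_span) == length
    assert span_length(desa_span) == length
    matches = implied_identical_residues(length, percent)
    print(f"{length:>2} aa window: {matches}/{length} identical ({percent}%)")
```

Under these assumptions the reported percentages correspond to 10, 5, 5 and 7 identical residues in the four windows.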

To analyse the developmental expression of the gene
encoding mRNA corresponding to SEQ ID NO:1, the cDNA
insert in plasmid pCF3 was used as a radiolabeled
hybridization probe on mRNA samples from leaf, root,
germinating seedling, and developing siliques from both
wild type and mutant 3707 Arabidopsis plants,
essentially as described in Maniatis et al., Molecular
Cloning, A Laboratory Manual (1982) Cold Spring Harbor
Laboratory Press. The results indicated that while the
mRNA corresponding to SEQ ID NO:1 is detected in all
tissues from the mutant plant, its levels are lower than
in wild-type tissues. This is consistent with the
observation that the fatty acid mutation in line 3707 is
leaky relative to the known Arabidopsis fad 3 mutant
obtained via chemical mutagenesis. These results
confirmed that the T-DNA in line 3707 had interrupted
the normal expression of a fatty acid desaturase gene.
Based on the fatty acid phenotype of homozygous mutant
line 3707, Applicants concluded that the cDNA insert in
pCF3 encoded the delta-15 desaturase. Further,
Applicants concluded that it was the microsomal delta-15
desaturase, and not the chloroplastic delta-15
desaturase, since: a) the mutant phenotype was
expressed strongly in the seed but expressed poorly, if
at all, in the leaf of line 3707, and b) the delta-15
desaturase polypeptide, by comparison to the desA
polypeptide, did not have an N-terminal extension of a
transit peptide expected for a nuclear-encoded
chloroplast desaturase.

The identity of SEQ ID NO:2 as the Arabidopsis
microsomal delta-15 desaturase was confirmed by its
biological overexpression in plant tissues. For this,
the 1.4 kb Not I fragment of plasmid pCF3 containing the
delta-15 desaturase cDNA was placed in the sense
orientation behind either the CaMV 35S promoter, to
provide constitutive expression, or behind the promoter
for the gene encoding soybean α' subunit of the
β-conglycinin (7S) seed storage protein, to provide
embryo-specific expression. The chimeric genes 35S
promoter/sense SEQ ID NO:1/3' nopaline synthase and
β-conglycinin/sense SEQ ID NO:1/3' phaseolin were then
transformed into plant cells by Agrobacterium
tumefaciens's binary Ti plasmid vector system [Hoekema
et al. (1983) Nature 303:179-180; Bevan (1984) Nucl.
Acids Res. 12:8711-8720].

To confirm the identity of SEQ ID NO:1 and to test
the biological effect of its overexpression in a
heterologous plant species, the chimeric gene 35S
promoter/sense SEQ ID NO:1/3' nopaline synthase was
transformed into a binary vector, which was then
transferred into Agrobacterium tumefaciens strain R1000,
carrying the Ri plasmid pRiA4b from Agrobacterium
rhizogenes [Moore et al. (1979) Plasmid 2:617-626].
Carrot (Daucus carota L.) cells were transformed by
co-cultivation of carrot root disks with strain R1000
carrying the chimeric gene by the method of Petit et al.
(1986) [Mol. Gen. Genet. 202:388-393]. Fatty acid
analyses of transgenic carrot "hairy" roots show that
overexpression of Arabidopsis microsomal delta-15
desaturase can result in over 10-fold increase in 18:3
at the expense of 18:2.

To complement the delta-15 desaturation mutation in
the T-DNA mutant line 3707 and to test the biological
effect of overexpression of SEQ ID NO:1 in seed, the
embryo-specific promoter/SEQ ID NO:1/3' phaseolin
chimeric gene was transformed into a binary vector,
which was then transformed into the avirulent
Agrobacterium strain LBA4404/pAL4404 [Hoekema et al.
(1983) Nature 303:179-180]. Roots of line 3707 were
transformed by the engineered Agrobacterium, transformed
plants were selected and grown to give rise to seeds.
Fatty acid analysis of the seeds from two plants showed
that one out of six seeds in each plant showed the
mutant fatty acid phenotype, while the remaining seeds
showed more than 10-fold increase in 18:3 to ca. 55%.
While the sample size is small, this segregation
suggests Mendelian inheritance of the fatty acid
phenotype. While most of the increase occurs at the
expense of 18:2, some of it also occurs at the expense
of 18:1. Thus, overexpression of this gene in oil
crops, especially canola, which is a close relative of
Arabidopsis, is also expected to result in the high
levels of 18:3 that are found in specialty oil of
linseed.
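
The observation of one mutant seed out of six can be weighed against a simple Mendelian expectation with a binomial calculation. The sketch below assumes a single dominant transgene locus in a selfed hemizygous plant, so that one quarter of the seeds are expected to show the mutant phenotype. That assumption is made here for illustration and is not stated in the text above.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


N_SEEDS = 6        # seeds analyzed per plant, as reported above
P_MUTANT = 0.25    # assumed 3:1 segregation (illustrative assumption)

for k in range(N_SEEDS + 1):
    print(f"{k} mutant seeds of {N_SEEDS}: {binomial_pmf(k, N_SEEDS, P_MUTANT):.3f}")
```

Under this assumption, exactly one mutant seed in six has a probability of about 0.36, so the observation is compatible with a 3:1 ratio but cannot discriminate strongly between alternatives at this sample size.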

Comparisons of the sequence of the 386 amino acid
polypeptide by the method of Needleman et al. (J. Mol.
Biol. (1970) 48:443-453) with those for the microsomal
stearoyl-CoA (delta-9) desaturases from rat, mouse and
yeast revealed 21%, 19%, and 17% identities,
respectively. While the membrane-associated Arabidopsis
delta-15 desaturase protein showed significant but
limited homology to the desA protein, it showed no
significant homology to the soluble stearoyl-ACP
(delta-9) desaturases from higher plants, including one
from Arabidopsis.
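
Percent identities such as those above are derived from a global alignment of the two sequences. The sketch below shows a global alignment in the style of Needleman and Wunsch with a simple linear gap penalty. The scoring values, the toy sequences and the choice of alignment length as the denominator are all illustrative assumptions, not the parameters or sequences used in this application.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of strings a and b; returns the two gapped strings."""
    def pair_score(x, y):
        return match if x == y else mismatch

    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(a[i - 1], b[j - 1]),
                score[i - 1][j] + gap,   # gap in b
                score[i][j - 1] + gap,   # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and score[i][j] == score[i - 1][j - 1] + pair_score(a[i - 1], b[j - 1])):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical, non-gap columns as a percentage of the alignment length."""
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


# Toy peptides (hypothetical, for illustration only).
top, bottom = needleman_wunsch("HWCA", "HCA")
print(top)
print(bottom)
print(percent_identity(top, bottom))
```

Reported identity values depend on the scoring scheme and on whether gapped columns are counted in the denominator, so figures from different programs are not directly comparable.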

Comparison of partial nucleotide sequences of 
plasmids pF1 and pS1 showed that the left T-DNA
border:plant DNA junction is ca. 700 bp from the
initiation codon in SEQ ID NO:1. To determine the
position of the other T-DNA:plant DNA junction with
respect to the pF1 sequence, the T-DNA:plant DNA
junction fragment was isolated. Genomic DNA from mutant
line 3707, isolated as described previously, was
partially digested by restriction enzyme Mbo I to give
an average fragment size of ca. 15 kb. The fragment
ends were partially filled with dGTP and dATP by Klenow
and cloned into Xho I half-sites of LambdaGEM®-11
(Promega Corporation) following the manufacturer's
protocol. The phage library was titered and used
essentially as described in Ausubel et al. [Current
Protocols in Molecular Biology (1989) John Wiley &
Sons]. The genomic phage library was screened with
radiolabeled PCR product, ca. 0.6 kb, derived from the 5'
end of the gene in pF1. This product spans from 3 bp to
the right of where the left T-DNA border inserted to 15
bp to the left of nucleotide position 1 in SEQ ID NO:1.
Southern blot analysis of DNA from one of the purified,
positively-hybridizing phages following Eco RI
restriction digestion and electrophoresis showed that a
4 kb Eco RI fragment hybridized to the 0.6 kb PCR
product. The Eco RI fragment was subcloned and subjected
to sequence analyses. Comparison of the sequences
derived from this fragment, pF1 and pS1 showed that the
insertion of T-DNA resulted in a 56 bp deletion at the
site of insertion and that the T-DNA interrupted the
Arabidopsis gene 711 bp 5' to the initiation codon in SEQ
ID NO:1. Thus, the T-DNA inserts 5' to the open reading
frame, consistent with the leaky expression of the gene
encoding SEQ ID NO:1 and the leaky fatty acid phenotype
in mutant 3707. While the left T-DNA:plant DNA junction
is precise, that is without any sequence rearrangement
in either the left T-DNA border or the flanking plant
DNA, the other T-DNA:plant DNA junction is complex and
not fully characterized.
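
The coordinates quoted for the genomic sequence, the cDNA and the T-DNA insertion site can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in this section. It assumes that there is no intron between the insertion site and cDNA position 1, and the exact result may shift by a base or two depending on how the end offsets are counted. All names are illustrative.

```python
# Coordinates quoted in the text (1-based).
GENOMIC_TO_CDNA_SHIFT = 68 - 1      # SEQ ID NO:3 nt 68 aligns with SEQ ID NO:1 nt 1
ATG_IN_CDNA = 46                    # first base of the initiation codon, SEQ ID NO:1
ATG_IN_GENOMIC = 113                # first base of the initiation codon, SEQ ID NO:3
UPSTREAM_STOPS_GENOMIC = (47, 56)   # first bases of the upstream termination codons
INSERTION_TO_ATG_BP = 711           # T-DNA insertion site, bp 5' of the initiation codon
PCR_GAP_AFTER_INSERTION_BP = 3      # PCR product starts 3 bp right of the insertion
PCR_GAP_BEFORE_CDNA_BP = 15         # and ends 15 bp left of cDNA position 1


def genomic_to_cdna(pos):
    """Map a SEQ ID NO:3 position onto SEQ ID NO:1 within the colinear region."""
    return pos - GENOMIC_TO_CDNA_SHIFT


def in_frame(pos_a, pos_b):
    """True when two codon start positions share a reading frame."""
    return (pos_a - pos_b) % 3 == 0


def approximate_pcr_product_length():
    """Approximate PCR product length, assuming no intervening intron."""
    insertion_to_cdna_start = INSERTION_TO_ATG_BP - (ATG_IN_CDNA - 1)
    return (insertion_to_cdna_start
            - PCR_GAP_AFTER_INSERTION_BP
            - PCR_GAP_BEFORE_CDNA_BP)


print(genomic_to_cdna(ATG_IN_GENOMIC))
print([in_frame(ATG_IN_GENOMIC, stop) for stop in UPSTREAM_STOPS_GENOMIC])
print(approximate_pcr_product_length())
```

Under these assumptions the product length comes out near 650 bp, in line with the ca. 0.6 kb stated for the PCR product, and both upstream termination codons fall in the same frame as the initiation codon.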

Plasmid pCF3 was deposited on December 3, 1991 with
the American Type Culture Collection of Rockville,
Maryland under the provisions of the Budapest Treaty and
bears accession number ATCC 68875.

Using Arabidopsis Delta-15 Desaturase cDNA as a
Hybridization Probe to Isolate cDNAs Encoding
Related Desaturases from Arabidopsis

The 1.4 kb Not I insert fragment isolated from
plasmid pCF3 was purified, radiolabeled, and used to
screen approximately 80,000 clones from the cDNA library
made to poly A+ mRNA from 3-day-old etiolated
Arabidopsis thaliana as described above, except that
lower stringency hybridizations (1 M NaCl, 50 mM Tris-
HCl, pH 7.5, 1% SDS, 5% dextran sulfate, 0.1 mg/ml
denatured salmon sperm DNA and 50°C) and washes
(sequentially with 2X SSPE, 0.1% SDS at room temperature
for 5 min and then again with fresh solution for 10 min,
and finally with 0.5X SSPE, 0.1% SDS at 50°C for 5 min)
were used. Approximately 17 strongly-hybridizing and 17
weakly-hybridizing plaques were identified in the
primary screen. Four of the weakly-hybridizing plaques
were picked and subjected to one or two further rounds
of screening with the radiolabeled probe as above until
they were pure. To ensure that these were not delta-15
desaturase clones, they were further analyzed to
determine whether they hybridized to an 18 bp oligomer
specific to the 3' non-coding region of delta-15
desaturase cDNA (pCF3). After autoradiography of the
filters, one of the clones was found not to hybridize to
this probe. This clone was picked, and a plasmid clone
containing the cDNA insert was obtained as described
above. Restriction analysis of this plasmid, designated
pCM2, showed that it had an approximately 1.3 kb cDNA
insert which lacked a 0.7 kb Nco I - Bgl II fragment
characteristic of the Arabidopsis delta-15 desaturase
cDNA of pCF3. (This fragment corresponds to the DNA
located between the Nco I site at nucleotides 474 to 479
and the Bgl II site at nucleotides 1164 to 1169 in SEQ
ID NO:1). Partial nucleotide sequences of single
strands from the 5' region and 3' region of pCM2
revealed that the cDNA insert was incomplete and that it
encoded a polypeptide that is similar to, but distinct
from, that encoded by the cDNA in pCF3. In order to
isolate a full-length version of the cDNA in plasmid
pCM2, the 1.3 kb Not I fragment from plasmid pCM2
containing the cDNA insert was isolated and used as a
radiolabeled hybridization probe to rescreen the same
Arabidopsis cDNA library as above. Three strongly
hybridizing plaques were purified and the plasmids
excised as described previously. The three resultant
plasmids were digested by Not I restriction enzyme and
shown to contain cDNA inserts ranging in size between 1
kb and 1.5 kb. The complete nucleotide sequence
of the cDNA insert in one of these plasmids, designated
pACF2-2, was determined and is shown in SEQ ID NO:4.
SEQ ID NO:4 shows the 5' to 3' nucleotide sequence of
1525 base pairs of the Arabidopsis thaliana cDNA which
encodes a fatty acid desaturase. Nucleotides 10-12 and
nucleotides 1348 to 1350 are, respectively, the putative
initiation codon and the termination codon of the open
reading frame (nucleotides 10 to 1350). The open
reading frame was confirmed by comparison of its deduced
amino acid sequence with that of the related delta-15
fatty acid desaturase from soybean in this application.
Nucleotides 1 to 9 and 1351 to 1525 are, respectively,
the 5' and 3' untranslated nucleotides. The 446 amino
acid protein sequence in SEQ ID NO:5 is that deduced
from the open reading frame in SEQ ID NO:4 and has an
estimated molecular weight of 51 kD. Alignment of SEQ
ID NOS:2 and 5 shows an overall homology of
approximately 80% and that the latter has an
approximately 55 amino acid long N-terminal extension,
which is deduced to be a transit peptide found in
nuclear-encoded plastid proteins.
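
The open reading frame coordinates given for SEQ ID NO:4 can be checked against the stated protein length. The sketch below is illustrative: the function name is hypothetical, and the mass estimate uses an assumed average residue mass of about 110 Da, a common rule of thumb that is not taken from this application.

```python
AVERAGE_RESIDUE_MASS_KD = 0.110  # assumed rule-of-thumb value, not from the source


def orf_summary(start, end, total_length):
    """Summarize an ORF given 1-based inclusive coordinates that include the stop codon."""
    orf_nt = end - start + 1
    assert orf_nt % 3 == 0, "ORF length must be a whole number of codons"
    residues = orf_nt // 3 - 1  # the final codon is the termination codon
    return {
        "orf_nt": orf_nt,
        "residues": residues,
        "stop_codon": (end - 2, end),
        "utr5_nt": start - 1,
        "utr3_nt": total_length - end,
        "approx_mass_kd": residues * AVERAGE_RESIDUE_MASS_KD,
    }


summary = orf_summary(start=10, end=1350, total_length=1525)
for key, value in summary.items():
    print(key, value)
```

With these inputs the frame holds 447 codons, that is 446 residues plus the termination codon at nucleotides 1348 to 1350, and the rough mass estimate falls within a few kD of the stated 51 kD.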

To analyse the developmental expression of the gene
corresponding to SEQ ID NO:4, this sequence was used as
a radiolabeled hybridization probe on mRNA samples from
leaf, root, germinating seedling, and developing
siliques from both wild type and mutant line 3707
Arabidopsis plants, essentially as described in Maniatis
et al. [Molecular Cloning, A Laboratory Manual (1982)
Cold Spring Harbor Laboratory Press]. The results
indicated that, in contrast to the constitutive
expression of the gene encoding SEQ ID NO:1, the mRNA
corresponding to SEQ ID NO:4 is abundant in green
tissues, rare in roots and leaves, and is about three-
fold more abundant in leaf than that of SEQ ID NO:1.

The cDNA in plasmid pCM2 was also shown to hybridize
polymorphically to genomic DNA from Arabidopsis thaliana
(ecotype Wassilewskija and marker line W100 ecotype
Landsberg background) digested with Eco RI. It was
used as an RFLP marker to map the genetic locus for the
gene encoding this fatty acid desaturase in Arabidopsis.
A single genetic locus was positioned corresponding to
this desaturase cDNA. Its location was thus determined
to be on chromosome 3 between the lambda AT228 and
cosmid C3838 RFLP markers, "north" of the glabrous locus
(Chang et al., Proc. Natl. Acad. Sci. USA (1988)
85:6856-6860; Nam et al., Plant Cell (1989) 1:699-705).
This approximates the region to which Arabidopsis fatty
acid desaturase fad 2, fad D, and fad B mutations map
[Somerville et al., (1992) in press]. Unsuccessful
efforts to clone the microsomal delta-12 fatty acid
desaturase using cDNA inserts of pCF3 and pACF2-2
along with the above data led Applicants to conclude that
the cDNA in pACF2-2 encodes a plastid delta-15 fatty
acid desaturase that corresponds to the fad D locus.
This conclusion will be confirmed by biological
expression of the cDNA in pACF2-2.

Plasmid pCM2 was deposited on November 27, 1991
with the American Type Culture Collection of Rockville,
Maryland under the provisions of the Budapest Treaty and
bears accession number ATCC 68852.

The 1.4 kb, 1.3 kb, and 1.5 kb Not I cDNA insert
fragments isolated from plasmids pCF3, pCM2 and pACF2-2
were purified, radiolabeled, and used several times to
screen at low stringency as described above two
different cDNA libraries: one was made to poly A+ mRNA
from 3-day-old etiolated Arabidopsis thaliana
("etiolated" library) as described above and one made to
poly A+ mRNA from the above-ground parts of Arabidopsis
thaliana plants, which varied in size from those that
had just opened their primary leaves to plants which had
bolted and were flowering [Elledge et al. (1991) Proc.
Natl. Acad. Sci. USA 88:1731-1735]. The cDNA inserts in
the library were made into an Xho I site flanked by Eco
RI sites in lambda Yes vector [Elledge et al. (1991)
Proc. Natl. Acad. Sci. USA 88:1731-1735] ("leaf"
library). Several plaques from both libraries that
hybridized weakly and in duplicate lifts to both SEQ ID
NOS:1 and 4 were subjected to plaque purification.
Phagemids were excised from the pure phages from the
"etiolated" library as described above. Plasmids were
excised from the purified phages of the "leaf" library
by site-specific recombination using the cre-lox
recombination system in E. coli strain BNN132 [Elledge
et al. (1991) Proc. Natl. Acad. Sci. USA 88:1731-1735].
In all cases, nucleotide sequencing of the cloned DNA
revealed clones either identical to SEQ ID NOS:1 or 4 or
unrecognizable sequences.

In another set of experiments, ca. 400,000 phages in
the "leaf" library were screened with SEQ ID NOS:1 and 4
at low stringency (26°C, 1 M Na+, 50% formamide) and
high stringency (42°C, 1 M Na+, 50% formamide). Of the
several positive signals on the primary plaque lifts, 11
showed high stringency hybridization to SEQ ID NO:1, 35
showed high stringency hybridization to SEQ ID NO:4, and
39 hybridized to both at low stringency only. Twenty-seven
plaques of the low stringency signals came through
a secondary low-stringency screen, 17 of which were used
to make DNA from excised plasmids. Of the 17 plasmid DNAs
that were sequenced, 8 were unrecognizable sequences, 5
were identical to SEQ ID NO:1, 2 were identical to SEQ ID
NO:2, and 2 were identical to one another and related
to but distinct from SEQ ID NOS:1 and 4. The novel
desaturase sequence, designated pFadx-2, was also
isolated from the "leaf" library independently by using
as a hybridization probe a 0.6 kb PCR product derived by
polymerase chain reaction on poly A+ RNA made from both
canola seed as well as Arabidopsis leaves, as described
elsewhere in this application, using degenerate
oligomers made to conserved sequences between plant
delta-15 desaturases and the cyanobacterial desA
desaturase. The PCR-derived plasmid, designated pYacp7,
was sequenced partially from both ends. Comparison of
the sequences of pFadx-2 and pYacp7 revealed that the
two independently cloned cDNAs contained an identical
sequence that was related to the other delta-15
desaturases and that both were incomplete cDNAs. A
partial composite sequence derived from both plasmids,
pFadx-2 and pYacp7, is shown in SEQ ID NO:16 as a 5' to
3' nucleotide sequence of 472 bp. Nucleotides 2-4 and
nucleotides 468 to 470 are, respectively, the first and
the last codons in the open reading frame. This open
reading frame is shown in SEQ ID NO:17. SEQ ID NO:17 was
compared to the other delta-15 desaturase polypeptides
disclosed in this application by the method
of Needleman et al. [J. Mol. Biol. (1970) 48:443-453]
using gap weight and gap length weight values of 3.0 and
0.1, respectively. The overall identities are between
65% and 68% between SEQ ID NO:17 and the microsomal
delta-15 desaturases from Arabidopsis, canola and
soybean, and between 77% and 87% between SEQ ID NO:17
and the plastid delta-15 desaturases from Arabidopsis,
canola and soybean. In addition, SEQ ID NO:17 has an
N-terminal peptide extension compared to the microsomal
delta-15 desaturases that shows homology to the transit
peptide sequence in Arabidopsis plastid delta-15
desaturase. On the basis of these comparisons it is
deduced that SEQ ID NO:16 encodes a plastid delta-15
desaturase. There is genetic data in Arabidopsis
suggesting the presence of two loci for plastid delta-15
desaturase. The full-length version of SEQ ID NO:16 can
be readily isolated by one skilled in the art. The
biological effect of introducing SEQ ID NO:16 or its
full-length version into plants will be used to confirm
its identity.

Plasmid pYacp7 was deposited on 20 November 1992
with the American Type Culture Collection of Rockville,
Maryland under the provisions of the Budapest Treaty and
bears accession number ATCC 69129.

Using Arabidopsis Delta-15 Desaturase cDNAs
as Hybridization Probes to Isolate
Delta-15 Desaturase cDNAs from Other Plant Species
For the purpose of cloning the Brassica napus seed
cDNAs encoding delta-15 fatty acid desaturases, the cDNA
inserts from pCF3 and pCM2 were isolated by polymerase
chain reaction from the respective plasmids,
radiolabeled, and used as hybridization probes to screen
a lambda phage cDNA library made with poly A+ mRNA from
developing Brassica napus seeds 20-21 days after
pollination. This cDNA library was screened several
times at low stringency, using the Arabidopsis cDNA
probes mentioned above. One of the Brassica napus
cDNAs obtained in the initial screens was used as probe
in a subsequent high stringency screen.

The Arabidopsis pCM2 insert was radiolabeled and used
as probe to screen approximately 300,000 plaques under
low stringency hybridization conditions. The filter
hybridizations were performed in 50 mM Tris pH 7.6, 6X
SSC, 5X Denhardt's, 0.5% SDS, 100 ug denatured calf
thymus DNA at 50°C overnight, and the posthybridization
washes were carried out in 6X SSC, 0.5% SDS at room
temperature for 15 min, then repeated with 2X SSC, 0.5%
SDS at 45°C for 30 min, and then repeated twice with
0.2X SSC, 0.5% SDS at 50°C for 30 min. Five strongly
hybridizing phages were obtained. These were plaque
purified and used to excise the phagemids as described
in the manual of the pBluescript II Phagemid Kit from
Stratagene (Stratagene 1991 catalogue, item 212205).
One of these, designated pBNSF3-2, contained a 1.3 kb
insert. pBNSF3-2 was sequenced completely on both
strands and the nucleotide sequence is shown in SEQ ID
NO:6. Plasmid pBNSF3-2 was deposited on 27 November
1991 with the American Type Culture Collection of
Rockville, Maryland, USA under the provisions of the
Budapest Treaty and bears the accession number 68854.
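The wash series above lowers the salt concentration (6X, 2X, then 0.2X SSC) while raising the temperature, which increases stringency step by step. A minimal sketch of why, assuming a commonly quoted empirical melting-temperature relation for long DNA duplexes (it is not taken from this application), a hypothetical probe GC content of 45%, and a 1.3 kb probe length:

```python
import math

# 1X SSC is 0.15 M NaCl plus 0.015 M trisodium citrate, i.e. about 0.195 M Na+.
SSC_1X_NA_MOLAR = 0.15 + 3 * 0.015


def duplex_tm_celsius(na_molar, gc_percent, formamide_percent, length_bp):
    """Approximate Tm of a long DNA duplex (empirical relation, an assumption here)."""
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)


def wash_tm(ssc_fold, gc_percent=45.0, length_bp=1300):
    """Tm in a formamide-free SSC wash; GC content and length are hypothetical."""
    return duplex_tm_celsius(ssc_fold * SSC_1X_NA_MOLAR, gc_percent, 0.0, length_bp)


if __name__ == "__main__":
    # (SSC strength, wash temperature in Celsius); 22 stands in for "room temperature".
    for ssc_fold, wash_temp in [(6.0, 22.0), (2.0, 45.0), (0.2, 50.0), (0.2, 60.0)]:
        margin = wash_tm(ssc_fold) - wash_temp
        print(f"{ssc_fold}X SSC at {wash_temp} C: about {margin:.0f} C below Tm")
```

The smaller the margin between wash temperature and Tm, the fewer mismatched duplexes survive the wash. The relation is only a rough guide at the high salt of 6X SSC, and SDS is ignored.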

An additional low stringency screen using pCM2
probe provided eight strongly hybridizing phages. One
of these, designated pBNSFd-8, contained a 0.4 kb insert.
pBNSFd-8 was sequenced completely on one strand; this
nucleotide sequence showed significant divergence from
the sequence SEQ ID NO:6 in the homologous region, which
suggested that it corresponded to a novel Brassica napus
seed desaturase different from that shown in SEQ ID
NO:6. The pBNSFd-8 insert was radiolabeled and used as
hybridization probe in a high stringency screen of the
Brassica napus seed cDNA library. The hybridization
conditions were identical to those of the low stringency
screen described above except that the temperature of the
final two 30 min posthybridization washes in 0.2X SSC,
0.5% SDS was increased to 60°C. This screen resulted in
three strongly hybridizing phages that were purified and
excised. One of the excised plasmids, pBNSFd-3, contained
a 1.4 kb insert that was sequenced completely on both
strands. SEQ ID NO:8 shows the complete nucleotide
sequence of pBNSFd-2.

Using Arabidopsis Delta-15 Desaturase cDNA as a
Hybridization Probe to Isolate a Glycerolipid
Desaturase cDNA from Soybean

A cDNA library was made to poly A+ mRNA isolated
from developing soybean seeds, and screened essentially
as described above, except that filters were
prehybridized in 25 mL of hybridization buffer
consisting of 50 mM Tris-HCl, pH 7.5, 1 M NaCl, 1% SDS,
5% dextran sulfate and 0.1 mg/mL denatured salmon sperm
DNA (Sigma Chemical Co.) at 50°C for 2 h. Radiolabeled
probe prepared from pCF3 as described above was added,
and allowed to hybridize for 18 h at 50°C. The probes
were washed twice at room temperature with 2X SSPE, 1%
SDS for five min followed by washing for 5 min at 50°C
in 0.2X SSPE, 1% SDS. Autoradiography of the filters
indicated that there was one strongly hybridizing
plaque, and approximately five weakly hybridizing
plaques. The more strongly hybridizing plaque was
subjected to a second round of screening as before,
except that the final wash was for 5 min at 60°C in 0.2X
SSPE, 1% SDS. Numerous strongly hybridizing plaques
were observed, and one, well isolated from other phage,
was picked for further analysis.

Sequences of the pBluescript vector from the
purified phage, including the cDNA insert, were excised
in the presence of a helper phage and the resultant
phagemid was used to infect E. coli XL-1 Blue cells.
DNA from the plasmid, designated pXF1, was made by the
alkaline lysis miniprep procedure described in Sambrook
et al. (Molecular Cloning, A Laboratory Manual, 2nd ed.
(1989) Cold Spring Harbor Laboratory Press). The
alkali-denatured double-stranded DNA from pXF1 was
completely sequenced on both strands. The insert of
pXF1 contained a stretch of 1783 nucleotides which
contained an unknown open reading frame and also
contained a poly-A stretch of 16 nucleotides 3' to the
open reading frame, from nucleotides 1767 to 1783,
followed by an Eco RI restriction site. The 2184 bases
that followed this Eco RI site contained a 1145 bp open
reading frame which encoded a polypeptide of about 68%
identity to, and colinear with, the Arabidopsis delta-15
desaturase polypeptide listed in SEQ ID NO:2. The
putative start methionine of the 1145 bp open reading
frame corresponded to the start methionine of the
Arabidopsis microsomal delta-15 peptide and there were
no amino acids corresponding to a plastid transit
peptide 5' to this methionine. When the insert in pXF1
was digested with Eco RI, four fragments were observed:
fragments of approximately 370 bp and 1400 bp,
derived from the first 1783 bp of the insert in pXF1,
and fragments of approximately 600 bp and 1600 bp,
derived from the other 2184 nucleotides of the
insert in pXF1. Only the 600 bp and 1600 bp fragments
hybridized with probe derived from pCF3 on Southern
blots. It was deduced that pXF1 contained two different
cDNA inserts separated by an Eco RI site and the second
of these inserts was a 2184 bp cDNA encoding a soybean
microsomal delta-15 desaturase. The complete nucleotide
sequence of the 2184 bp soybean microsomal delta-15 cDNA
contained in plasmid pXF1 is listed in SEQ ID NO:10.
Plasmid pXF1 was deposited on December 3, 1991 with the
American Type Culture Collection of Rockville, Maryland
under the provisions of the Budapest Treaty and bears
accession number ATCC 68874.
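The reasoning behind the "two inserts" deduction can be checked with simple arithmetic: the two fragments from each half of the insert should add up to roughly that half's length. A minimal sketch, in which the internal Eco RI cut positions (370 and 600) are hypothetical values chosen only to match the approximate fragment sizes reported above:

```python
def fragment_sizes(insert_length, cut_positions):
    """Sizes of the linear pieces left after cutting at the given positions."""
    bounds = [0] + sorted(cut_positions) + [insert_length]
    return [right - left for left, right in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # First cDNA (1783 bp) and second cDNA (2184 bp), each assumed to carry
    # one internal Eco RI site; the positions are illustrative, not from the text.
    print(fragment_sizes(1783, [370]))  # compare with ~370 bp and ~1400 bp
    print(fragment_sizes(2184, [600]))  # compare with ~600 bp and ~1600 bp
```

Gel estimates are approximate, so agreement within a few tens of base pairs is all that should be expected.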

Using Soybean Microsomal Delta-15 Desaturase cDNA as a
Hybridization Probe to Isolate
cDNAs Encoding Related Desaturases from Soybean

A 1.0 kb fragment of DNA corresponding to part of
the coding region of the soybean microsomal delta-15
desaturase cDNA contained in plasmid pXF1 was excised
with the restriction enzyme Hha I and gel purified. The
fragment was labeled with 32P as described above and
used to probe a soybean cDNA library as described above.
Autoradiography of the filters indicated that there were
eight hybridizing plaques and these were subjected to a
second round of screening. Sequences of the pBluescript
vector from all eight of the purified phages, including
the cDNA inserts, were excised in the presence of a
helper phage and the resultant phagemids were used to
infect E. coli XL-1 Blue cells. DNA from the plasmids
was made by the alkaline lysis miniprep procedure
described in Sambrook et al. (Molecular Cloning, A
Laboratory Manual, 2nd ed. (1989) Cold Spring Harbor
Laboratory Press). Restriction analysis showed they
contained inserts ranging from 1.0 kb to 3.0 kb in size.
One of these plasmids, designated pSFD-118bwp, contained
an insert of about 1700 bp. The alkali-denatured
double-stranded DNA from pSFD-118bwp was completely
sequenced on both strands, shown in SEQ ID NO:12. The
insert of pSFD-118bwp contained a stretch of 1675
nucleotides which contained an open reading frame
encoding a polypeptide, shown in SEQ ID NO:13, of about
80% identity with, and colinear with, the Arabidopsis
plastid delta-15 desaturase polypeptide listed in SEQ ID
NO:5. The open reading frame also encoded amino acids
corresponding to a plastid transit peptide at the 5' end
of the open reading frame. The transit peptide was
colinear with, and shared some homology to, the transit
peptide described for the Arabidopsis plastid delta-15
glycerolipid desaturase. The complete nucleotide
sequence of the 1675 bp soybean plastid delta-15
glycerolipid desaturase cDNA is listed in SEQ ID NO:12.

Comparison of the different delta-15 desaturase
sequences disclosed in the application by the method of
Needleman et al. (J. Mol. Biol. (1970) 48:443-453) using gap
weight and gap length weight values of 3.0 and 0.1,
respectively, reveals the relatedness between them as shown
in Table 3.

TABLE 3

a3, aD, c3, cD, s3 and sD refer, respectively, to
SEQ ID NO:2 (Arabidopsis microsomal delta-15
desaturase), SEQ ID NO:5 (Arabidopsis plastid delta-15
desaturase), SEQ ID NO:7 (canola microsomal delta-15
desaturase), SEQ ID NO:9 (canola plastid delta-15
desaturase), SEQ ID NO:11 (soybean microsomal delta-15
desaturase), and SEQ ID NO:13 (soybean plastid delta-15
desaturase). Based on these comparisons, the delta-15
desaturases, of both microsomal and plastid types, have
overall identities of 65% or more at the amino acid
level, even when from different plant species.
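The pairwise comparisons above rest on global alignment with a gap weight (charged once per gap) and a gap length weight (charged per gapped position). A minimal sketch of that kind of alignment follows. It is not the program used in the application: the match and mismatch scores (1.0 and 0.0) are assumptions standing in for an amino acid scoring matrix, and percent identity is computed over the aligned, ungapped columns, which is one of several conventions.

```python
NEG_INF = float("-inf")


def global_align(a, b, gap_weight=3.0, gap_length_weight=0.1,
                 match=1.0, mismatch=0.0):
    """Global alignment with affine gaps; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # States: 0 = residues paired, 1 = residue of a against a gap, 2 = gap against b.
    score = [[[NEG_INF] * (m + 1) for _ in range(n + 1)] for _ in range(3)]
    back = [[[0] * (m + 1) for _ in range(n + 1)] for _ in range(3)]
    open_cost = gap_weight + gap_length_weight  # first position of a new gap
    score[0][0][0] = 0.0
    for i in range(1, n + 1):
        score[1][i][0] = -(gap_weight + gap_length_weight * i)
        back[1][i][0] = 1 if i > 1 else 0
    for j in range(1, m + 1):
        score[2][0][j] = -(gap_weight + gap_length_weight * j)
        back[2][0][j] = 2 if j > 1 else 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            diag = [score[k][i - 1][j - 1] for k in range(3)]
            up = [score[k][i - 1][j]
                  - (gap_length_weight if k == 1 else open_cost) for k in range(3)]
            left = [score[k][i][j - 1]
                    - (gap_length_weight if k == 2 else open_cost) for k in range(3)]
            for state, cands, bonus in ((0, diag, pair), (1, up, 0.0), (2, left, 0.0)):
                prev = max(range(3), key=cands.__getitem__)
                score[state][i][j] = cands[prev] + bonus
                back[state][i][j] = prev
    # Trace back from the best-scoring end state.
    state = max(range(3), key=lambda k: score[k][n][m])
    best, i, j = score[state][n][m], n, m
    top, bottom = [], []
    while i > 0 or j > 0:
        prev = back[state][i][j]
        if state == 0:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif state == 1:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
        state = prev
    return best, "".join(reversed(top)), "".join(reversed(bottom))


def percent_identity(aligned_a, aligned_b):
    """Identical residues as a percentage of columns where both sequences have a residue."""
    paired = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not paired:
        return 0.0
    return 100.0 * sum(x == y for x, y in paired) / len(paired)
```

With a gap weight of 3.0 and a gap length weight of 0.1, opening a gap is costly but extending it is cheap, so the alignment prefers a few long gaps (such as a transit peptide extension) over many short ones.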

Isolation of Nucleotide Sequences Encoding
Homologous and Heterologous Glycerolipid Desaturases

Fragments of the instant invention may be used to
isolate cDNAs and genes of homologous and heterologous
glycerolipid desaturases from the same species as the
fragment of the invention or from different species.
Isolation of homologous genes using sequence-dependent
protocols is well known in the art. Southern blot
analysis revealed that the Arabidopsis microsomal
delta-15 desaturase cDNA (SEQ ID NO:1) hybridized to
genomic DNA fragments of corn and soybean. In addition,
Applicants have demonstrated that it can be used to
isolate cDNAs encoding seed microsomal delta-15
desaturases from Brassica napus (SEQ ID NO:6) and
soybean (SEQ ID NO:10). Thus, one can isolate cDNAs and
genes for homologous glycerolipid desaturases from the
same or different higher plant species, especially from
the oil-producing species.

More importantly, one can use the fragments of the
invention to isolate cDNAs and genes for heterologous
glycerolipid desaturases, including those found in
plastids. Thus, Arabidopsis microsomal delta-15
desaturase cDNA (SEQ ID NO:1) was successfully used as a
hybridization probe to isolate cDNAs encoding the
related plastid delta-15 desaturases from Arabidopsis
(SEQ ID NO:4) and Brassica napus (SEQ ID NO:8), and the
soybean microsomal delta-15 desaturase cDNA (SEQ ID
NO:10) was successfully used to isolate soybean cDNA
encoding plastid delta-15 desaturase (SEQ ID NO:12).

In a particular embodiment of the present
invention, regions of the nucleic acid fragments of the
invention that are conserved between different
desaturases may be used by one skilled in the art to
design a mixture of degenerate oligomers for use in
sequence-dependent protocols aimed at isolating nucleic
acid fragments encoding other homologous or heterologous
glycerolipid desaturase cDNAs or genes. For example,
by comparing all desaturase polypeptides one can
identify stretches of amino acids that are conserved
between them, and then use the conserved amino acid
sequence to design oligomers, either short degenerate
ones or long ones, or "guessmers" as known by one skilled
in the art (see Sambrook et al., Molecular Cloning, A
Laboratory Manual, 2nd ed. (1989), Cold Spring Harbor
Laboratory Press). Such oligomers and "guessmers" may
be used as hybridization probes as known to one skilled
in the art.

For example, comparison of cyanobacterial desA and
plant delta-15 desaturases revealed a particularly well
conserved stretch of amino acids (amino acids 97-108 in
SEQ ID NO:1). SEQ ID NOS:20 and 21 represent two sets
of 36-mers, each 16-fold degenerate, made to this region.
End-labeled oligomers represented in SEQ ID NOS:20 and
21 were mixed and used as hybridization probes to screen
Arabidopsis cDNA libraries. Most of the positively
hybridizing plaques also hybridized to cDNAs encoding
Arabidopsis microsomal and plastid delta-15 desaturases
(SEQ ID NOS:1 and 4). However, the use of SEQ ID NOS:20
and 21 did not give consistent and reproducible results.
A 135 base-long oligomer (SEQ ID NO:32) was also made as
an antisense strand to a longer stretch of the same
conserved region, amino acids 97 to 141 in SEQ ID NO:1
(FVLGHDCGHGSFSDIPLLNSWGHILHSFILVPYHGWRISHRTHH). At
positions of ambiguity, the design used either
deoxyinosines or most frequently used codons based on
the codon usage in Arabidopsis genes. When used as a
hybridization probe, the 135-mer hybridized to all
plaques that also hybridized to cDNAs encoding
Arabidopsis microsomal and plastid delta-15 desaturases
(SEQ ID NOS:1 and 4). In addition, it also hybridized
to plaques that did not hybridize to SEQ ID NOS:1 and
4. The latter were purified and excised as described
previously. Nucleotide sequencing of the cDNA inserts
in the resultant plasmids revealed DNA sequences that
did not show any relatedness to any desaturase.

For another example, in the polymerase chain
reaction (Innis et al., Eds. (1990) PCR Protocols: A
Guide to Methods and Applications, Academic Press, San
Diego), two short pieces of the present fragment of the
invention can be used to amplify a longer glycerolipid
desaturase DNA fragment from DNA or RNA. The polymerase
chain reaction may also be performed on a library of
cloned nucleotide sequences with one primer based on the
fragment of the invention and the other on either the
poly A+ tail or a vector sequence. These oligomers may
be unique sequences or degenerate sequences derived from
the nucleic acid fragments of the invention. The longer
piece of homologous glycerolipid desaturase DNA
generated by this method could then be used as a probe
for isolating related glycerolipid desaturase genes or
cDNAs from Arabidopsis or other species. The design of
oligomers, including long oligomers using deoxyinosine,
and "guessmers" for hybridization or for the polymerase
chain reaction is known to one skilled in the art and
discussed in Sambrook et al. (Molecular Cloning, A
Laboratory Manual, 2nd ed. (1989), Cold Spring Harbor
Laboratory Press). Stretches of conserved amino acids
between delta-15 desaturase and other desaturases,
especially desA, allow for the design of such oligomers.
For example, conserved stretches of amino acids between
desA and delta-15 desaturase, discussed above, are
useful in designing long oligomers for hybridization as
well as shorter ones for use as primers in the
polymerase chain reaction. In this regard, the
conserved amino acid stretch of amino acids 91 to 108 of
SEQ ID NO:2 is particularly useful. Other conserved
regions in SEQ ID NO:2 useful for this purpose are amino
acids 299 to 309, amino acids 115 to 121, and amino
acids 133 to 141. Amino acid stretch 133 to 141 in SEQ
ID NO:2 shows especially good homology to several
desaturases. For example, in this stretch, amino acids
133, 137, 138, 140 and 141 are conserved in plant
delta-15 desaturases, cyanobacterial desA, and yeast and
mammalian microsomal stearoyl-CoA desaturases.
Comparison of cyanobacterial desA and plant delta-15
desaturases revealed two particularly well conserved
stretches of amino acids (amino acids 97-108 and amino
acids 299-311 in SEQ ID NO:1) that can be used for PCR.
The following sets of PCR primers were made to these
regions:

SEQ ID NO   Length   Fold degeneracy   AA positions   Strand   AA sequence
20          36       16                97-108         (S)      FVLGHDCGHGSF
21          36       16                97-108         (S)      FVLGHDCGHGSF
28          36       16                97-108         (S)      FVLGHDCGHGSF
29          36       16                97-108         (S)      FVLGHDCGHGSF
22          18       72                100-105        (S)      GHDCGH
23          18       72                100-105        (S)      GHDCGH
24          18       72                299-304        (AS)     HDIGTH
25          18       72                299-304        (AS)     HDIGTH
26          23       416               304-309        (AS)     HVIHHL
27          23       416               304-309        (AS)     HVIHHL
30          38       64                299-311        (AS)     HDIGTHVIHHLFP
31          38       64                299-311        (AS)     HDIGTHVIHHLFP
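The fold degeneracy of a primer set depends on how many codon choices each amino acid allows. A minimal sketch of the upper bound, counting every codon of the standard genetic code for each residue, follows. The function name is hypothetical, and the result is not expected to equal the fold values listed for the primers above, since the design choices that reduce degeneracy (for example splitting primers into sets, deoxyinosine, or preferred codons) are not captured here.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODONS_PER_AMINO_ACID = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def backtranslation_degeneracy(peptide):
    """Number of distinct DNA sequences that encode the given peptide."""
    return prod(CODONS_PER_AMINO_ACID[residue] for residue in peptide)


if __name__ == "__main__":
    for peptide in ("GHDCGH", "HDIGTH", "HVIHHL"):
        print(peptide, backtranslation_degeneracy(peptide))
```

Peptides rich in Met, Trp, and two-codon residues (His, Asp, Cys) give the lowest counts, which is one reason histidine-rich conserved stretches are attractive primer targets.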



In one experiment, PCRs were performed using SEQ ID
NOS:22 and 23 as sense primers and either SEQ ID NOS:24
and 25 or SEQ ID NOS:26 and 27 as antisense primers on
poly A+ RNA purified from both Arabidopsis leaf and
canola developing seeds. All PCRs resulted in PCR
products of the correct size (ca. 630 bp). The PCR
products from Arabidopsis and canola were purified and
used as radiolabeled hybridization probes to screen the
Lambda Yes Arabidopsis cDNA library, as described above.
This led to the isolation of a pure phage, which was
excised to give plasmid pYacp7. The cDNA insert in
pYacp7 was partially sequenced. Its sequence showed
that it encoded an incomplete desaturase polypeptide
that was identical to that encoded by another cDNA (in
plasmid pFadx-2) isolated by low-stringency
hybridization as described previously. The composite
sequence derived from the partial sequences from the
cDNA inserts in pFadx-2 and pYacp7 is shown in SEQ ID
NO:16 and the polypeptide encoded by it in SEQ ID NO:17.
As discussed previously, SEQ ID NO:17 is a putative
plastid delta-15 desaturase. This is further supported
by Southern blot analysis using radiolabeled cDNA
inserts from either pCF3, pACF2-2, or pYacp7 on
Arabidopsis genomic DNA digested with one of several
enzymes. It shows that the different inserts hybridize
to different restriction fragments and that only the
inserts from pACF2-2 and pYacp7 show some
cross-hybridization.
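The "correct size" of a PCR product can be estimated from the amino acid positions that the primers span. A minimal sketch, assuming the template is spliced mRNA or cDNA (no introns) and that the primers cover whole codons; the function name is hypothetical:

```python
def expected_amplicon_bp(first_sense_aa, last_antisense_aa):
    """Product length in bp from the first sense-primer codon to the last antisense-primer codon."""
    if last_antisense_aa < first_sense_aa:
        raise ValueError("antisense primer must lie downstream of the sense primer")
    return (last_antisense_aa - first_sense_aa + 1) * 3


if __name__ == "__main__":
    # Sense primers at amino acids 100-105; antisense primers at 299-304 or 304-309.
    print(expected_amplicon_bp(100, 304))
    print(expected_amplicon_bp(100, 309))
```

Both primer combinations give products of a little over 600 bp, in line with the ca. 630 bp reported above; primer tails extending beyond the listed codons would shift the figure by a few base pairs.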

In another PCR experiment, PCR was performed using
ca. 80 pmoles each of SEQ ID NOS:28 and 29 as sense
primers and ca. 94 pmoles each of SEQ ID NOS:30 and 31
as antisense primers on poly A+ RNA purified from
Arabidopsis mutant line 3707. This was performed using
the GeneAmp® RNA PCR Kit (Perkin Elmer Cetus) following
the manufacturer's protocol and using the following
program: a) 1 cycle of 2 min at 95°C, b) 35 cycles of
1 min at 95°C (denaturation), 1 min at 50°C (annealing)
and 1 min at 65°C (extension), and c) 1 cycle of 7 min
at 65°C. The resulting PCR product, of the correct size
(ca. 630 bp), was purified, radiolabeled, and used as a
hybridization probe on a Southern blot of Arabidopsis
genomic DNA as described above. While it hybridized to
restriction fragments that also hybridized to SEQ ID
NOS:1 (Arabidopsis microsomal delta-15 desaturase), 4
(Arabidopsis plastid delta-15 desaturase), and 16
(Arabidopsis plastid delta-15 desaturase), it also
hybridized to novel fragments that did not hybridize to
previously cloned desaturase cDNAs. However, even after
several attempts, the radiolabeled PCR product did not
hybridize to any novel cDNA clone when used as a probe
on different Arabidopsis cDNA libraries; in all cases
it hybridized only to plaques that also hybridized to the
known desaturase cDNAs. Furthermore, the PCR product
was subcloned into a plasmid vector and, after screening
about 100 of these subclones, none gave rise to a clone
with a novel desaturase sequence.

The isolation of other glycerolipid desaturases
will become easier as more examples of glycerolipid
desaturases are isolated using the fragments of the
invention. Knowing the conserved amino acid sequences
from diverse desaturases will also allow one to identify
more and better consensus sequences. Such sequences can
be used to make hybridization probes or amplification
primers which will further aid in the isolation of
different glycerolipid desaturases, including those from
non-plant sources such as fungi, algae, and even
cyanobacteria, as well as other membrane-associated
desaturases from other organisms.

The function of the diverse nucleotide fragments
encoding glycerolipid desaturases that can be isolated
using the present invention can be identified by
transforming plants with the isolated desaturase
sequences, linked in sense or antisense orientation to
suitable regulatory sequences required for plant
expression, and observing the fatty acid phenotype of
the resulting transgenic plants. Preferred target
plants for the transformation are the same as the source
of the isolated nucleotide fragments when the goal is to
obtain inhibition of the corresponding endogenous gene
by antisense inhibition or cosuppression. Preferred
target plants for use in expression or overexpression of
the isolated nucleic acid fragments are plants with
known mutations in desaturation reactions, such as the
Arabidopsis desaturase mutants, mutant flax deficient in
delta-15 desaturation, or mutant sunflower deficient in
delta-12 desaturation. Alternatively, the function of
the isolated nucleic acid fragments can be determined
similarly via transformation of other organisms, such as
yeast or cyanobacteria, with chimeric genes containing
the nucleic acid fragment and suitable regulatory
sequences followed by analysis of fatty acid composition
and/or enzyme activity.

Overexpression of the Glycerolipid
Desaturase Enzymes in Transgenic Species

The nucleic acid fragment(s) of the instant
invention encoding functional glycerolipid
desaturase(s), with suitable regulatory sequences, can
be used to overexpress the enzyme(s) in transgenic
organisms. Such recombinant DNA constructs may include
either the native glycerolipid desaturase gene or a
chimeric glycerolipid desaturase gene isolated from the
same or a different species as the host organism. For
overexpression of glycerolipid desaturase(s), it is
preferable that the introduced gene be from a different
species to reduce the likelihood of cosuppression. For
example, overexpression of delta-15 desaturase in
soybean, rapeseed, or other oil-producing species to
produce altered levels of polyunsaturated fatty acids
may be achieved by expressing RNA from the entire cDNA
found in pCF3. Similarly, the isolated nucleic acid
fragments encoding glycerolipid desaturases from
Arabidopsis, rapeseed, and soybean can also be used by
one skilled in the art to obtain substantially
homologous full-length cDNAs, if not already obtained,
as well as the corresponding genes as fragments of the
invention. These, in turn, may be used to overexpress
the corresponding desaturases in plants. One skilled in
the art can also isolate the coding sequence(s) from the
fragment(s) of the invention by using and/or creating
sites for restriction endonucleases, as described in
Sambrook et al. (Molecular Cloning, A Laboratory
Manual, 2nd ed. (1989), Cold Spring Harbor Laboratory
Press). For example, the fragment in SEQ ID NO:1 in
plasmid pCF3 is flanked by Not I sites and can be
isolated as a Not I fragment that can be introduced in
the sense orientation relative to suitable plant
regulatory sequences. Alternatively, sites for Nco I
(5'-CCATGG-3') or Sph I (5'-GCATGC-3') that allow
precise removal of coding sequences starting with the
initiating codon "ATG" may be engineered into the
fragment(s) of the invention. For example, for
utilizing the coding sequence of delta-15 desaturase
from pCF3, an Sph I site can be engineered by
substituting nucleotides at positions 44, 45, and 49 of
SEQ ID NO:1 with G, C, and C, respectively.
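Both recognition sequences named above contain "ATG" at their third to fifth positions, so a site can be built around an initiating codon by changing only the two bases before it and the one base after it. A minimal sketch of that substitution follows; the 11-base example sequence is hypothetical and is not taken from SEQ ID NO:1.

```python
SPH_I = "GCATGC"
NCO_I = "CCATGG"


def engineer_site_at_start_codon(sequence, atg_index, site):
    """Overlay a restriction site on an ATG; return (new_sequence, changed 1-based positions)."""
    if sequence[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    start = atg_index - site.index("ATG")  # where the site begins in the sequence
    if start < 0 or start + len(site) > len(sequence):
        raise ValueError("site does not fit within the sequence")
    changed = [start + k + 1 for k, base in enumerate(site)
               if sequence[start + k] != base]
    new_sequence = sequence[:start] + site + sequence[start + len(site):]
    return new_sequence, changed


if __name__ == "__main__":
    example = "TTAAGATGGCT"  # hypothetical; ATG occupies positions 6-8 (1-based)
    print(engineer_site_at_start_codon(example, 5, SPH_I))
    print(engineer_site_at_start_codon(example, 5, NCO_I))
```

In the hypothetical example, creating the Sph I site changes the two positions before the ATG and the one after it, the same pattern as positions 44, 45, and 49 around an ATG at 46-48. Changing the base after the ATG can alter the second codon, which should be checked for any real construct.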

Inhibition of Plant Target
Genes by Use of Antisense RNA
Antisense RNA has been used to inhibit plant target
genes in a tissue-specific manner (see van der Krol et
al., Biotechniques (1988) 6:958-976). Antisense
inhibition has been shown using the entire cDNA sequence
(Sheehy et al., Proc. Natl. Acad. Sci. USA (1988)
85:8805-8809) as well as a partial cDNA sequence (Cannon
et al., Plant Molec. Biol. (1990) 15:39-47). There is
also evidence that the 3' non-coding sequences (Ch'ng
et al., Proc. Natl. Acad. Sci. USA (1989)
86:10006-10010) and fragments of 5' coding sequence,
containing as few as 41 base-pairs of a 1.87 kb cDNA
(Cannon et al., Plant Molec. Biol. (1990) 15:39-47), can
play important roles in antisense inhibition.

The use of antisense inhibition of the glycerolipid
desaturases may require isolation of the transcribed
sequence for one or more target glycerolipid desaturase
genes that are expressed in the target tissue of the
target plant. The genes that are most highly expressed
are the best targets for antisense inhibition. These
genes may be identified by determining their levels of
transcription by techniques, such as quantitative
analysis of mRNA levels or nuclear run-off
transcription, known to one skilled in the art.
For example, antisense inhibition of delta-15
desaturase in Brassica napus resulting in altered levels
of polyunsaturated fatty acids may be achieved by
expressing antisense RNA from the entire or partial cDNA
found in pBNSF3-2.

Inhibition of Plant Target
Genes by Cosuppression
The phenomenon of cosuppression has also been used
to inhibit plant target genes in a tissue-specific
manner. Cosuppression of an endogenous gene using the
entire cDNA sequence (Napoli et al., The Plant Cell
(1990) 2:279-289; van der Krol et al., The Plant Cell
(1990) 2:291-299) as well as a partial cDNA sequence
(730 bp of a 1770 bp cDNA) (Smith et al., Mol. Gen.
Genetics (1990) 224:477-481) is known.

The nucleic acid fragments of the instant invention
encoding glycerolipid desaturases, or parts thereof,
with suitable regulatory sequences, can be used to
reduce the level of glycerolipid desaturases, thereby
altering fatty acid composition, in transgenic plants
which contain an endogenous gene substantially
homologous to the introduced nucleic acid fragment. The
experimental procedures necessary for this are similar
to those described above for the overexpression of the
glycerolipid desaturase nucleic acid fragments. For
example, cosuppression of delta-15 desaturase in
Brassica napus resulting in altered levels of
polyunsaturated fatty acids may be achieved by
expressing in the sense orientation the entire or
partial seed delta-15 desaturase cDNA found in pBNSF3-2.
Selection of Hosts, Promoters and Enhancers
A preferred class of heterologous hosts for the
expression of the nucleic acid fragments of the
invention are eukaryotic hosts, particularly the cells
of higher plants. Particularly preferred among the
higher plants are the oil-producing species, such as
soybean (Glycine max), rapeseed (including Brassica
napus, B. campestris), sunflower (Helianthus annuus),
cotton (Gossypium hirsutum), corn (Zea mays), cocoa
(Theobroma cacao), safflower (Carthamus tinctorius), oil
palm (Elaeis guineensis), coconut palm (Cocos nucifera),
flax (Linum usitatissimum), and peanut (Arachis
hypogaea).

Expression in plants will use regulatory sequences
functional in such plants. The expression of foreign
genes in plants is well established (De Blaere et al.,
Meth. Enzymol. (1987) 153:277-291). The source of the
promoter chosen to drive the expression of the fragments
of the invention is not critical provided it has
sufficient transcriptional activity to accomplish the
invention by increasing or decreasing, respectively, the
level of translatable mRNA for the glycerolipid
desaturases in the desired host tissue. Preferred
promoters include (a) strong constitutive plant
promoters, such as those directing the 19S and 35S
transcripts in cauliflower mosaic virus (Odell et al.,
Nature (1985) 313:810-812; Hull et al., Virology (1987)
86:482-493), and (b) tissue- or developmentally-specific
promoters. Examples of tissue-specific promoters are
the light-inducible promoter of the small subunit of
ribulose 1,5-bis-phosphate carboxylase (if expression is
desired in photosynthetic tissues), the maize zein
protein promoter (Matzke et al., EMBO J. (1984)
3:1525-1532), and the chlorophyll a/b binding protein
promoter (Lampa et al., Nature (1986) 316:750-752).

Particularly preferred promoters are those that allow seed-specific
expression. This may be especially useful since seeds are the primary
source of vegetable oils and also since seed-specific expression will
avoid any potential deleterious effect in non-seed tissues. Examples of
seed-specific promoters include, but are not limited to, the promoters
of seed storage proteins, which can represent up to 90% of total seed
protein in many plants. The seed storage proteins are strictly
regulated, being expressed almost exclusively in seeds in a highly
tissue-specific and stage-specific manner (Higgins et al., Ann. Rev.
Plant Physiol. (1984) 35:191-221; Goldberg et al., Cell (1989)
56:149-160). Moreover, different seed storage proteins may be expressed
at different stages of seed development.

Expression of seed-specific genes has been studied in great detail (see
reviews by Goldberg et al., Cell (1989) 56:149-160 and Higgins et al.,
Ann. Rev. Plant Physiol. (1984) 35:191-221). There are currently
numerous examples of seed-specific expression of seed storage protein
genes in transgenic dicotyledonous plants. These include genes from
dicotyledonous plants for bean β-phaseolin (Sengupta-Gopalan et al.,
Proc. Natl. Acad. Sci. USA (1985) 82:3320-3324; Hoffman et al., Plant
Mol. Biol. (1988) 11:717-729), bean lectin (Voelker et al., EMBO J.
(1987) 6:3571-3577), soybean lectin (Okamuro et al., Proc. Natl. Acad.
Sci. USA (1986) 83:8240-8244), soybean Kunitz trypsin inhibitor
(Perez-Grau et al., Plant Cell (1989) 1:1095-1109), soybean
β-conglycinin (Beachy et al., EMBO J. (1985) 4:3047-3053), pea vicilin
(Higgins et al., Plant Mol. Biol. (1988) 11:683-695), pea convicilin
(Newbigin et al., Planta (1990) 180:461-470), pea legumin (Shirsat et
al., Mol. Gen. Genetics (1989) 215:326-331), and rapeseed napin (Radke
et al., Theor. Appl. Genet. (1988) 75:685-694), as well as genes from
monocotyledonous plants such as for maize 15 kD zein (Hoffman et al.,
EMBO J. (1987) 6:3213-3221), maize 18 kD oleosin (Lee et al., Proc.
Natl. Acad. Sci. USA (1991) 88:6181-6185), barley β-hordein (Marris et
al., Plant Mol. Biol. (1988) 10:359-366) and wheat glutenin (Colot et
al., EMBO J. (1987) 6:3559-3564). Moreover, promoters of seed-specific
genes operably linked to heterologous coding sequences in chimeric gene
constructs also maintain their temporal and spatial expression pattern
in transgenic plants. Such examples include use of the Arabidopsis
thaliana 2S seed storage protein gene promoter to express enkephalin
peptides in Arabidopsis and B. napus seeds (Vandekerckhove et al.,
Bio/Technology (1989) 7:929-932), bean lectin and bean β-phaseolin
promoters to express luciferase (Riggs et al., Plant Sci. (1989)
63:47-57), and wheat glutenin promoters to express chloramphenicol
acetyl transferase (Colot et al., EMBO J. (1987) 6:3559-3564).

Of particular use in the expression of the nucleic acid fragment of the
invention will be the heterologous promoters from several soybean seed
storage protein genes such as those for the Kunitz trypsin inhibitor
(Jofuku et al., Plant Cell (1989) 1:1079-1093), glycinin (Nielsen et
al., Plant Cell (1989) 1:313-328), and β-conglycinin (Harada et al.,
Plant Cell (1989) 1:415-425). Promoters of genes for the α- and
β-subunits of soybean β-conglycinin storage protein will be
particularly useful in expressing the mRNA or the antisense RNA in the
cotyledons at mid- to late-stages of seed development (Beachy et al.,
EMBO J. (1985) 4:3047-3053) in transgenic plants. This is because there
is very little position effect on their expression in transgenic seeds,
and the two promoters show different temporal regulation. The promoter
for the α-subunit gene is expressed a few days before that for the
β-subunit gene. This is important for transforming rapeseed where oil
biosynthesis begins about a week before seed storage protein synthesis
(Murphy et al., J. Plant Physiol. (1989) 135:63-69).

Also of particular use will be promoters of genes expressed during
early embryogenesis and oil biosynthesis. The native regulatory
sequences, including the native promoters, of the glycerolipid
desaturase genes expressing the nucleic acid fragments of the invention
can be used following their isolation by those skilled in the art.
Heterologous promoters from other genes involved in seed oil
biosynthesis, such as those for B. napus isocitrate lyase and malate
synthase (Comai et al., Plant Cell (1989) 1:293-300), delta-9
desaturase from safflower (Thompson et al., Proc. Natl. Acad. Sci. USA
(1991) 88:2578-2582) and castor (Shanklin et al., Proc. Natl. Acad.
Sci. USA (1991) 88:2510-2514), acyl carrier protein (ACP) from
Arabidopsis (Post-Beittenmiller et al., Nucl. Acids Res. (1989)
17:1777), B. napus (Safford et al., Eur. J. Biochem. (1988)
174:287-295), and B. campestris (Rose et al., Nucl. Acids Res. (1987)
15:7197), β-ketoacyl-ACP synthetase from barley (Siggaard-Andersen et
al., Proc. Natl. Acad. Sci. USA (1991) 88:4114-4118), and oleosin from
Zea mays (Lee et al., Proc. Natl. Acad. Sci. USA (1991) 88:6181-6185),
soybean (Genbank Accession No: X60773) and B. napus (Lee et al., Plant
Physiol. (1991) 96:1395-1397) will be of use. If the sequence of the
corresponding genes is not disclosed or their promoter region is not
identified, one skilled in the art can use the published sequence to
isolate the corresponding gene and a fragment thereof containing the
promoter. The partial protein sequences for the relatively-abundant
enoyl-ACP reductase and acetyl-CoA carboxylase are also published
(Slabas et al., Biochim. Biophys. Acta (1987) 877:271-280; Cottingham
et al., Biochim. Biophys. Acta (1988) 954:201-207) and one skilled in
the art can use these sequences to isolate the corresponding seed genes
with their promoters. Similarly, the fragments of the present invention
encoding glycerolipid desaturases can be used to obtain promoter
regions of the corresponding genes for use in expressing chimeric
genes.

Attaining the proper level of expression of the nucleic acid fragments
of the invention may require the use of different chimeric genes
utilizing different promoters. Such chimeric genes can be transferred
into host plants either together in a single expression vector or
sequentially using more than one vector.

It is envisioned that the introduction of enhancers or enhancer-like
elements into the promoter regions of either the native or chimeric
nucleic acid fragments of the invention will result in increased
expression to accomplish the invention. This would include viral
enhancers such as that found in the 35S promoter (Odell et al., Plant
Mol. Biol. (1988) 10:263-272), enhancers from the opine genes (Fromm et
al., Plant Cell (1989) 1:977-984), or enhancers from any other source
that result in increased transcription when placed into a promoter
operably linked to the nucleic acid fragment of the invention.

Of particular importance is the DNA sequence element isolated from the
gene for the α-subunit of β-conglycinin that can confer 40-fold
seed-specific enhancement to a constitutive promoter (Chen et al., Dev.
Genet. (1989) 10:112-122). One skilled in the art can readily isolate
this element and insert it within the promoter region of any gene in
order to obtain seed-specific enhanced expression with the promoter in
transgenic plants. Insertion of such an element in any seed-specific
gene that is expressed at different times than the β-conglycinin gene
will result in expression in transgenic plants for a longer period
during seed development.

The invention can also be accomplished by a variety of other methods to
obtain the desired end. In one form, the invention is based on
modifying plants to produce increased levels of glycerolipid
desaturases by virtue of introducing more than one copy of the foreign
gene containing the nucleic acid fragments of the invention. In some
cases, the desired level of polyunsaturated fatty acids may require
introduction of foreign genes for more than one kind of glycerolipid
desaturase.

Any 3' non-coding region capable of providing a polyadenylation signal
and other regulatory sequences that may be required for the proper
expression of the nucleic acid fragments of the invention can be used
to accomplish the invention. This would include 3' ends of the native
glycerolipid desaturase(s), viral genes such as from the 35S or the 19S
cauliflower mosaic virus transcripts, from the opine synthesis genes,
ribulose 1,5-bisphosphate carboxylase, or chlorophyll a/b binding
protein. There are numerous examples in the art that teach the
usefulness of different 3' non-coding regions.

Transformation Methods
Various methods of transforming cells of higher plants according to the
present invention are available to those skilled in the art (see EPO
Pub. 0 295 959 A2 and 0 318 341 A1). Such methods include those based
on transformation vectors utilizing the Ti and Ri plasmids of
Agrobacterium spp. It is particularly preferred to use the binary type
of these vectors. Ti-derived vectors transform a wide variety of higher
plants, including monocotyledonous and dicotyledonous plants
(Sukhapinda et al., Plant Mol. Biol. (1987) 8:209-216; Potrykus, Mol.
Gen. Genet. (1985) 199:183). Other transformation methods are available
to those skilled in the art, such as direct uptake of foreign DNA
constructs (see EPO Pub. 0 295 959 A2), techniques of electroporation
(Fromm et al., Nature (1986) (London) 319:791) or high-velocity
ballistic bombardment with metal particles coated with the nucleic acid
constructs (Kline et al., Nature (1987) (London) 327:70). Once
transformed, the cells can be regenerated by those skilled in the art.

Of particular relevance are the recently described methods to transform
foreign genes into commercially important crops, such as rapeseed (De
Block et al., Plant Physiol. (1989) 91:694-701), sunflower (Everett et
al., Bio/Technology (1987) 5:1201), and soybean (Christou et al., Proc.
Natl. Acad. Sci. USA (1989) 86:7500-7504).

Application to RFLP Technology

The use of restriction fragment length polymorphism (RFLP) markers in
plant breeding has been well-documented in the art (Tanksley et al.,
Bio/Technology (1989) 7:257-264). The nucleic acid fragments of the
invention can be used as RFLP markers for traits linked to expression
of glycerolipid desaturases. These traits will include altered levels
of unsaturated fatty acids. The nucleic acid fragment of the invention
can also be used to isolate the glycerolipid desaturase gene from
variant (including mutant) plants with altered levels of unsaturated
fatty acids. Sequencing of these genes will reveal nucleotide
differences from the normal gene that cause the variation. Short
oligonucleotides designed around these differences may be used as
hybridization probes to follow the variation in polyunsaturates.
Oligonucleotides based on differences that are linked to the variation
may be used as molecular markers in breeding these variant oil traits.

EXAMPLES

The present invention is further defined in the following Examples, in
which all parts and percentages are by weight and degrees are Celsius,
unless otherwise stated. It should be understood that these Examples,
while indicating preferred embodiments of the invention, are given by
way of illustration only. From the above discussion and these Examples,
one skilled in the art can ascertain the essential characteristics of
this invention, and without departing from the spirit and scope
thereof, can make various changes and modifications of the invention to
adapt it to various usages and conditions. All publications, including
patents and non-patent literature, referred to in this specification
are expressly incorporated by reference herein.

ISOLATION OF GENOMIC DNA FLANKING THE T-DNA SITE OF
INSERTION IN ARABIDOPSIS THALIANA MUTANT LINE 3707

Identification of an Arabidopsis thaliana T-DNA Mutant
with Low Linolenic Acid Content

A population of Arabidopsis thaliana (geographic race Wassilewskija)
transformants containing the T-DNA of Agrobacterium tumefaciens was
generated by seed transformation as described by Feldmann et al. (Mol.
Gen. Genetics (1987) 208:1-9). In this population the transformants
contain DNA sequences encoding the pBR322 bacterial vector, nopaline
synthase, neomycin phosphotransferase (NPTII, confers kanamycin
resistance), and β-lactamase (confers ampicillin resistance) within the
T-DNA border sequences. The integration of the T-DNA into different
areas of the chromosomes of individual transformants may cause a
disruption of plant gene function at or near the site of insertion, and
phenotypes associated with this loss of gene function can be analyzed
by screening the population for the phenotype.

T3 seed was generated from the wild type seed treated with
Agrobacterium tumefaciens by two rounds of self-fertilization as
described by Feldmann et al. (Science (1989) 243:1351-1354). These
progeny were segregating for the T-DNA insertion, and thus for any
mutation resulting from the insertion. Approximately 100 seeds of each
of 6000 lines were combined and the fatty acid content of each of the
6000 pooled samples was determined by gas chromatography of the fatty
acyl methyl esters essentially as described by Browse et al. (Anal.
Biochem. (1986) 152:141-145) except that 2.5% H2SO4 in methanol was
used as the methylation reagent and samples were heated for 1.5 h at
80°C to effect the methanolysis of the seed triglycerides. A line
designated "3707" produced seeds that gave an altered fatty acid
profile compared to that of the total population. T3 plants were grown
from individual T3 seeds produced by line 3707 and self-fertilized to
produce T4 seeds on individual plants that were either homozygous wild
type, homozygous mutant, or heterozygous for the mutation. The percent
fatty acid compositions of a representative subsample of the entire
population, of the pooled 3707 T3 seeds, and of a homozygous T4 mutant
segregant are shown in Table 4.

TABLE 4

Fatty Acid      T3 Pools from lines 350X-4aaO   3707 T3   3707 homozygous
Methyl Ester    average (s.d.)                  Pool      T4 segregant

palmitic         7.4 (0.37)                      7.0       6.4
stearic          3.0 (0.22)                      2.9       3.0
oleic           [illegible].0 (1.5)             17.7      15.9
linoleic        29.3 (0.78)                     35.0      42.4
linolenic       16.1 (1.1)                      10.2      [illegible]
eicosenoic      20.2 (0.73)                     20.5      23.6

The phenotype of the segregating T3 pool of line 3707 (high linoleic
acid, low linolenic acid) was intermediate between that of the
population subsample and the homozygous T4 mutant seeds, suggesting
that line 3707 harbored a mutation at a locus which controls the
conversion of linoleic to linolenic acid in the seed. Still, it was not
apparent whether the mutant phenotype in line 3707 was the result of a
T-DNA insertion.
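
As a rough illustration of what "intermediate" means here, the short
Python sketch below places the pooled T3 linoleic acid value on a scale
running from the population average (0.0) to the homozygous T4 mutant
(1.0). The three numbers are the linoleic acid percentages printed in
Table 4; the function name is hypothetical, and the idea that a
segregating pool should sit near the midpoint rests on an assumption of
roughly additive gene action that the text does not state.

```python
# Linoleic acid percentages as printed in Table 4.
POPULATION_AVG = 29.3   # T3 pools, population subsample
T3_POOL_3707 = 35.0     # segregating pool of line 3707
T4_MUTANT = 42.4        # homozygous T4 mutant segregant


def fraction_of_shift(wild, pool, mutant):
    """Return where `pool` lies between `wild` (0.0) and `mutant` (1.0)."""
    return (pool - wild) / (mutant - wild)


linoleic_shift = fraction_of_shift(POPULATION_AVG, T3_POOL_3707, T4_MUTANT)

# Assumption (not from the text): with additive gene action, a pool
# segregating 1:2:1 would be expected close to 0.5 on this scale.
print(f"3707 T3 pool lies {linoleic_shift:.2f} of the way to the mutant value")
```

A value somewhat below the midpoint is still consistent with the
qualitative statement in the text that the pool is intermediate.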

Therefore, Applicants checked a segregating T4 population to determine
whether the mutant fatty acid phenotype cosegregated with the nopaline
synthase activity and kanamycin resistance encoded by the T-DNA insert.
A total of 263 T4 plants were grown and assayed for the presence of
nopaline in leaf extracts (Errampalli et al., The Plant Cell (1991)
3:149-157). In addition, T5 seeds were collected from each of the T4
plants and samples of 10-50 seeds were taken to determine the seed
fatty acid composition and to determine their ability to germinate in
the presence of kanamycin (Feldmann et al., Science (1989)
243:1351-1354). The 263 plants fell into 3 classes as in Table 5.

TABLE 5

Number of
Individuals    Phenotype

 63            T4 plants: little or no nopaline present; T5 seeds: wild
               type fatty acid composition, all kanamycin sensitive

134            T4 plants: nopaline present; T5 seeds: heterozygous fatty
               acid composition similar to 3707 T3 pool, segregating for
               kanamycin resistance

 64            T4 plants: nopaline present; T5 seeds: homozygous mutant
               fatty acid composition, all kanamycin resistant



The cosegregation of the fatty acid phenotype with the phenotypes
conferred by T-DNA sequences in an approximately 1:2:1 pattern provided
strong evidence that the mutation in line 3707 was the result of a
T-DNA insertion. Further experiments were then conducted with the
intent of using probes containing T-DNA sequences to clone the T-DNA
insert and flanking genomic DNA from line 3707.
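
The "approximately 1:2:1" statement can be checked with a chi-square
goodness-of-fit test. The sketch below is an illustration only and is
not part of the described experiments: it uses the three class counts
listed in Table 5 (which sum to 261) and a hypothetical helper,
`chi_square_gof`. For two degrees of freedom the chi-square survival
function reduces to exp(-x/2), so no statistics library is needed.

```python
import math

OBSERVED = [63, 134, 64]   # class counts as listed in Table 5
RATIO = [1, 2, 1]          # single-locus Mendelian expectation


def chi_square_gof(observed, ratio):
    """Chi-square statistic and expected counts for a given ratio."""
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, expected


chi2, expected = chi_square_gof(OBSERVED, RATIO)

# Three classes give 2 degrees of freedom; for df = 2 the upper-tail
# probability of the chi-square distribution is exactly exp(-chi2 / 2).
p_value = math.exp(-chi2 / 2)

print(f"expected counts: {expected}")
print(f"chi-square = {chi2:.3f}, p = {p_value:.2f}")
```

A small chi-square value with a large p-value means the observed counts
do not deviate detectably from 1:2:1, in line with the conclusion drawn
in the text.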

Preparation of Genomic DNA from Homozygous 3707 Plants

Seeds from a homozygous line derived from Arabidopsis thaliana
(geographic race Wassilewskija (WS)) line 3707 were surface sterilized
for 5 min at room temperature in a solution of 5.25% sodium
hypochlorite (w/v)/0.15% Tween 20 (v/v), then washed several times in
sterile distilled water, with a final rinse in 50% ethanol. Immediately
following the ethanol wash, the seeds were transferred to sterile
filter paper to dry. One to three seeds were then transferred to 250-mL
flasks containing 50 mL of sterile Gamborg's B5 media (Gibco,
500-1153EA), pH 6.0. Cultures were incubated at 22°C, 70 µE·m⁻²·s⁻¹ of
continuous light for approximately three weeks, after which time the
root tissue was harvested, made into 10 g aliquots (wet weight),
lyophilized, and stored at -20°C.

Using a variation of the procedure of Shure et al. (Cell (1983)
35:225-233), genomic DNA was isolated from the root tissue. Two
aliquots of lyophilized tissue were ground to a fine powder using a
mortar and pestle. The ground tissue was added to a flask containing
85 mL of lysis buffer (7 M urea, 0.35 M NaCl, 0.05 M Tris-HCl, pH 8.0,
0.02 M EDTA, 1% Sarkosyl, 5% phenol) and mixed gently with a glass rod
to obtain a homogeneous suspension. To this suspension an equal volume
of phenol:chloroform:isoamyl alcohol (25:24:1) (equilibrated with 10 mM
Tris, pH 8, 1 mM EDTA) was added. After the addition of 8.5 mL of 10%
SDS the mixture was swirled on a rotating platform for 15 min at room
temperature. After centrifugation at 2000 x g for 15 min, the upper
aqueous phase was removed to a new tube and extracted two more times,
as above, but without the addition of SDS. To the final aqueous phase
was added 1/20th the volume of 3 M potassium acetate, pH 5.5 and two
times the volume of ice cold 100% ethanol. Precipitation of the DNA was
facilitated by incubation at -20°C for one hour followed by
centrifugation at 12,000 x g for 10 min. The resulting pellet was
resuspended in 3 mL of 10 mM Tris, pH 8, 1 mM EDTA to which was added
0.95 g of cesium chloride (CsCl) and 21.4 µL of 10 mg/mL ethidium
bromide (EtBr) per mL of solution. The DNA was then purified by
centrifugation to equilibrium in a CsCl/EtBr density gradient for 16 h
at 15°C, 265,000 x g. After removal from the gradient, the DNA was
extracted with isopropanol saturated with TE buffer (10 mM Tris, pH 8;
1 mM EDTA) and CsCl to remove EtBr and then dialyzed overnight at 4°C
against 10 mM Tris, pH 8, 1 mM EDTA to remove CsCl. The DNA was removed
from dialysis and the concentration was determined using the Hoechst
fluorometric assay in which an aliquot of DNA is added to 3 mL of
1.5 x 10⁻⁷ M bisbenzimide (Hoechst 33258, Sigma) in 1X SSC (0.15 M
NaCl, 0.015 M sodium citrate), pH 7.0, incubated at room temperature
for 5 min, and read on a fluorometer at excitation 360, emission 450,
against a known set of DNA standards.
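
Reading a sample "against a known set of DNA standards" amounts to
fitting a standard curve and inverting it. The sketch below shows one
common way to do that, a least-squares straight line. It is an
assumption that the fluorometer response is linear over the range used,
and the standard amounts and readings are invented for illustration;
none of these values come from the text.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical standards: ng of DNA added to the 3 mL dye solution and
# the corresponding fluorometer readings (arbitrary units).
STANDARD_NG = [0.0, 50.0, 100.0, 200.0, 400.0]
STANDARD_READING = [2.0, 27.0, 52.0, 102.0, 202.0]

slope, intercept = fit_line(STANDARD_NG, STANDARD_READING)


def dna_ng_from_reading(reading):
    """Invert the standard curve to estimate ng of DNA in the cuvette."""
    return (reading - intercept) / slope


sample_ng = dna_ng_from_reading(77.0)   # hypothetical sample reading
print(f"estimated DNA in aliquot: {sample_ng:.0f} ng")
```

Dividing the estimated mass by the volume of the aliquot added gives
the concentration of the stock.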

Plasmid Rescue and Analysis
Five micrograms of genomic DNA from the homozygous 3707 mutant,
prepared as described above, was digested with 20 units of either
Bam HI or Sal I restriction enzyme (Bethesda Research Laboratory) in a
50 µL reaction volume according to the manufacturer's specifications.
After digestion the DNA was extracted with buffer-saturated phenol
(Bethesda Research Laboratory) followed by precipitation in ethanol.
The resulting pellet was resuspended in a final volume of 10 µL of
10 mM Tris, pH 8, and the concentration of the DNA was determined using
the Hoechst fluorometric assay as above.

To facilitate circularization, as opposed to end-to-end joining, a
dilute ligation reaction was set up containing 250 ng of Bam HI or
Sal I digested genomic DNA, 3 Weiss units of T4 DNA ligase (Promega),
50 µL of 10X ligase buffer (30 mM Tris-HCl, pH 7.8, 100 mM MgCl2,
100 mM DTT, 5 mM ATP) and 5 µL of 100 mM ATP in a 500 µL reaction
volume. The reaction was incubated for 16 h at 16°C, heated for 10 min
at 70°C, and extracted once with buffer-saturated phenol (Bethesda
Research Laboratory). The DNA was then precipitated with the addition
of two volumes of 100% ethanol and 1/10th volume of 7.5 M ammonium
acetate. The resulting pellet was resuspended in a final volume of
10 µL of 10 mM Tris, pH 8, and the concentration of the DNA was
determined using the Hoechst fluorometric assay as above.
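
How dilute the ligation is can be made explicit with a simple
C1·V1 = C2·V2 calculation. The sketch below is illustrative only: it
takes the quantities stated above (250 ng DNA, 50 µL of buffer stock
containing 5 mM ATP, 5 µL of 100 mM ATP, 500 µL total) and computes the
final DNA and ATP concentrations. The helper name `final_conc` is
hypothetical.

```python
TOTAL_UL = 500.0   # final reaction volume stated in the text


def final_conc(stock_conc, stock_vol_ul, total_vol_ul=TOTAL_UL):
    """Dilution by C1*V1 = C2*V2; result is in the units of stock_conc."""
    return stock_conc * stock_vol_ul / total_vol_ul


# DNA: 250 ng spread over the whole reaction volume.
dna_ng_per_ul = 250.0 / TOTAL_UL

# ATP comes from two sources: the buffer stock and the extra addition.
atp_from_buffer_mM = final_conc(5.0, 50.0)    # 50 uL of stock at 5 mM ATP
atp_added_mM = final_conc(100.0, 5.0)         # 5 uL of 100 mM ATP
atp_total_mM = atp_from_buffer_mM + atp_added_mM

print(f"DNA: {dna_ng_per_ul} ng/uL")
print(f"ATP: {atp_total_mM} mM total")
```

A low DNA concentration makes it more likely that the two ends of the
same fragment meet than ends of different fragments, which is the
rationale the text gives for diluting the reaction.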

Competent DH10B cells (Bethesda Research Laboratory) were transfected
with 50 ng of ligated DNA at a concentration of 10 ng of DNA per 100 µL
of cells according to the manufacturer's specifications. Transformants
from Sal I or Bam HI digests were selected on LB plates (10 g
Bacto-tryptone, 5 g Bacto-yeast extract, 5 g NaCl, 15 g agar per liter,
pH 7.4) containing 100 µg/mL ampicillin or 25 µg/mL kanamycin sulfate,
respectively. Ampicillin-resistant (Ampʳ; ampicillin sensitivity, Ampˢ)
Sal I transformants were screened for the presence of the kanamycin
resistance (Kanʳ; kanamycin sensitivity, Kanˢ) gene by picking primary
transformants and stabbing them first to LB plates containing 100 µg/mL
ampicillin then to LB plates containing 25 µg/mL kanamycin. After
overnight incubation at 37°C the plates were scored for Ampʳ/Kanˢ
colonies. Kanamycin-resistant Bam HI transformants were screened for
the presence of the ampicillin resistance gene by picking primary
transformants and stabbing them first to LB plates containing 25 µg/mL
kanamycin and then to LB plates containing 100 µg/mL ampicillin. After
overnight incubation at 37°C the plates were scored for Kanʳ/Ampˢ
colonies.

Cultures were made of 192 Ampʳ/Kanˢ Sal I transformants and 85
Kanʳ/Ampˢ Bam HI transformants directly into deep-well microtiter
plates containing 200 µL of LB broth (10 g Bacto-tryptone, 5 g
Bacto-yeast extract, 5 g NaCl per liter) with 100 µg/mL ampicillin.
Using the Schleicher and Schuell Minifold I apparatus and Nytran
membranes, dot blots were set up, in duplicate, using the following
conditions: 50 µL of culture was diluted into 150 µL of 5X SSC; the
culture was lysed and the DNA denatured by the addition of 150 µL of
0.5 M NaOH, 1.5 M NaCl solution for 3 min at room temperature; the
filter was removed from the apparatus and neutralized in 0.5 M Tris,
pH 8, 1.5 M NaCl; the DNA was then UV cross-linked to the filters using
the Stratagene Stratalinker; and the filters were heated for 2 h at
80°C and stored at room temperature.

To determine whether T-DNA was contained within any of the rescued
plasmids, the dot blots were probed with portions of the right and left
borders of T-DNA. The right border probe consisted of a 2.2 kb
Hind III-Dra I fragment of DNA obtained from plasmid H23pKC7 (composed
of the 3.2 kb Hind III 23 fragment from Ti plasmid pTiC58 (Lemmers et
al., J. Mol. Biol. (1989) 144:353-376) cloned into plasmid vector pKC7
(Maniatis et al., Molecular Cloning, A Laboratory Manual (1982) Cold
Spring Harbor Laboratory Press)), and the left border probe consisted
of a 2.9 kb Hind III-Eco RI fragment obtained from plasmid H10pKC7
(composed of the 6.5 kb Hind III 10 fragment from Ti plasmid pTiC58
(Lemmers et al., J. Mol. Biol. (1989) 144:353-376) cloned into plasmid
vector pKC7 (Maniatis et al., Molecular Cloning, A Laboratory Manual
(1982) Cold Spring Harbor Laboratory Press)). Both fragments were
isolated using standard digestion, electrophoresis, and electroelution
conditions as described in Sambrook et al. (Molecular Cloning, A
Laboratory Manual, 2nd ed. (1989) Cold Spring Harbor Laboratory Press).
Final DNA purification was obtained by passage of the eluted DNA over
an Elutip-D column (Schleicher and Schuell) using the manufacturer's
specifications. Concentration of the DNA was determined using the
Hoechst fluorometric assay as above. Approximately 100 ng of each probe
was labeled with α[32P]dCTP using a Random Priming Kit from Bethesda
Research Laboratories under conditions recommended by the manufacturer.
Labeled probe was separated from unincorporated α[32P]dCTP by passing
the reaction through a Sephadex G-25 spun column under standard
conditions as described in Sambrook et al. (Molecular Cloning, A
Laboratory Manual, 2nd ed. (1989) Cold Spring Harbor Laboratory Press).

The filters were pre-hybridized in 150 mL of buffer consisting of 6X
SSC, 10X Denhardt's solution, 1% SDS, and 100 µg/mL denatured calf
thymus DNA for 16 h at 42°C. The denatured, purified, labeled probe was
added to the pre-hybridized filters following transfer of the filters
to 50 mL of hybridization buffer consisting of 6X SSC, 1% SDS, 10%
dextran sulfate, and 50 µg/mL denatured calf thymus DNA. Following
incubation of the filters in the presence of the probe for 16 h at
65°C, the filters were washed twice in 150 mL of 5X SSC, 0.5% SDS,
twice in 1X SSC, 1% SDS and once in 0.1X SSC, 1% SDS, all at 65°C. The
washed filters were subjected to autoradiography on Kodak XAR-2 film at
-80°C overnight.
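
The wash series steps down in salt concentration, which raises
stringency. The sketch below converts an "nX SSC" designation into
molar concentrations using the 1X SSC composition given earlier in this
Example (0.15 M NaCl, 0.015 M sodium citrate). The sodium ion estimate
assumes the citrate is the trisodium salt, which the text does not
state; function names are hypothetical.

```python
# 1X SSC composition as defined in the text (mol/L).
SSC_1X = {"NaCl": 0.15, "sodium citrate": 0.015}


def ssc_molarity(fold):
    """Molar composition of `fold`-X SSC, e.g. fold=0.1 for 0.1X SSC."""
    return {name: conc * fold for name, conc in SSC_1X.items()}


def sodium_molarity(fold):
    """Approximate Na+ concentration, assuming trisodium citrate."""
    parts = ssc_molarity(fold)
    return parts["NaCl"] + 3 * parts["sodium citrate"]


# Hybridization (6X) and two of the wash strengths named in the text.
for fold in (6, 1, 0.1):
    print(f"{fold}X SSC: about {sodium_molarity(fold):.4f} M Na+")
```

Lower sodium concentration at a fixed temperature destabilizes
imperfectly matched hybrids, so the final 0.1X wash is the most
stringent step of the series.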

Of the 85 Bam HI candidates, 63 hybridized with the left border probe
and none hybridized with the right border probe. Of the 192 Sal I
candidates, 31 hybridized with the left border probe, 4 hybridized with
the right border probe, and none hybridized with both probes. Twelve of
the Bam HI candidates, 7 positive and 5 negative for the presence of
the left border of T-DNA, were further analyzed by restriction digests.

DNA from the Bam HI candidates was made by the alkaline lysis miniprep
procedure of Birnboim et al. (Nuc. Acid Res. (1979) 7:1513-1523), as
described in Sambrook et al. (Molecular Cloning, A Laboratory Manual,
2nd ed. (1989), Cold Spring Harbor Laboratory Press). The plasmid DNA
was digested with Eco RI restriction enzyme (Bethesda Research
Laboratories) in accordance with the manufacturer's specifications and
electrophoresed through a 0.8% agarose gel in 1X TBE buffer (0.089 M
Tris-borate, 0.089 M boric acid, 0.002 M EDTA). All of the Bam HI
candidates which hybridized with the left border probe of T-DNA had the
same Eco RI restriction pattern, which indicated the presence of
14.2 kb of T-DNA and 1.4 kb of putative plant genomic DNA in these
clones.

DNA from Sal I candidates was isolated, restriction-analyzed using
Eco RI, Bam HI and Sal I enzymes, and electrophoresed through a 0.8%
agarose gel, as above. All of the Sal I candidates which hybridized
with the left border probe of T-DNA included 2.9 kb of putative plant
DNA. Contained within this 2.9 kb fragment was a 1.4 kb Bam HI-Eco RI
fragment as seen with the Bam HI rescued plasmids, suggesting that the
1.4 kb fragment was a subset of the 2.9 kb fragment and that it was
adjacent to the left border of the T-DNA at its site of insertion into
the plant genome. Sequence analysis of one Sal I candidate (pS1) using
a primer homologous to the left border sequence of T-DNA revealed that
the sequence of pS1 was colinear with the sequence of the T-DNA left
border (Yadav et al., Proc. Natl. Acad. Sci. USA (1982) 79:6322-6326)
up to nucleotide 65, followed by non-T-DNA (putative plant) sequences.

Southern Analysis with Putative Plant
DNA from Rescued Plasmids
DNA from the seven Bam HI candidates which hybridized with the left
border of the T-DNA was pooled and a portion was digested with Eco RI
and Bam HI restriction endonucleases and electrophoretically separated
on a 0.8% agarose gel in 1X TBE buffer. After excising a 1.4 kb
Eco RI-Bam HI fragment from the agarose gel, the 1.4 kb fragment was
purified by use of a Gene Clean Kit from Bio 101. Fifty nanograms of
the resulting DNA fragment was labeled with α[32P]dCTP using a Random
Priming Kit (Bethesda Research Laboratory) under conditions recommended
by the manufacturer.

Three micrograms of total genomic DNA from homozygous wild-type
Arabidopsis and homozygous 3707 mutant Arabidopsis plants was digested
to completion with one of the following restriction enzymes: Sal I,
Hind III, Eco RI, Cla I, and Bam HI under conditions suggested by the
manufacturer. The digested DNA was subjected to electrophoresis and
Southern transfer to Hybond-N membranes (Amersham) as described in
Sambrook et al. (Molecular Cloning, A Laboratory Manual, 2nd ed. (1989)
Cold Spring Harbor Laboratory Press). After Southern transfer, the
membranes were exposed to UV light using the Stratalinker (Stratagene)
as per the manufacturer's instructions, air dried, and heated at 68°C
for 2 h.

The filters were prehybridized in 1 M NaCl, 50 mM
Tris-Cl, pH 7.5, 1% sodium dodecyl sulfate, 5% dextran
sulfate, 100 µg/mL of denatured salmon sperm DNA at 65°C
overnight. Fifty nanograms of the radiolabeled 1.4 kb
Eco RI-Bam HI plant DNA fragment prepared above was
added to the prehybridization solution containing the
Southern blot and further incubated at 65°C overnight.
The filter was washed for 10 min twice in 200 mL 2X
SSPE, 0.1% sodium dodecyl sulfate at 65°C and for 10 min
in 200 mL 0.5X SSPE, 0.1% sodium dodecyl sulfate at
65°C. Hybridizing fragments were detected by
autoradiography. The analysis confirmed that the probe
fragment contained plant DNA and that the T-DNA
integration site was in a 2.8 kb Bam HI, a 5.2 kb Hind
III, a 3.5 kb Sal I, a 5.5 kb Eco RI, and an
approximately 9 kb Cla I fragment of wild type
Arabidopsis DNA.
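
The wash solutions described above can be made up by diluting concentrated stocks. The following is a minimal sketch of that arithmetic, assuming 20X SSPE and 10% SDS stock solutions; the stock concentrations and the function name are illustrative assumptions and are not stated in the text.

```python
def stock_volumes(final_ml, components):
    """Return the mL of each stock (and of water) needed for final_ml of solution.

    components maps a name to (stock_concentration, final_concentration),
    both expressed in the same unit (X for SSPE, % for SDS).
    """
    vols = {name: final_ml * final / stock
            for name, (stock, final) in components.items()}
    vols["water"] = final_ml - sum(vols.values())
    return vols

# Assumed stocks: 20X SSPE and 10% SDS (hypothetical, not given in the source).
first_wash = stock_volumes(200, {"SSPE": (20, 2), "SDS": (10, 0.1)})
final_wash = stock_volumes(200, {"SSPE": (20, 0.5), "SDS": (10, 0.1)})

for label, recipe in (("2X SSPE, 0.1% SDS", first_wash),
                      ("0.5X SSPE, 0.1% SDS", final_wash)):
    print(label, {k: round(v, 2) for k, v in recipe.items()})
```

Lowering the SSPE concentration in the final wash reduces the salt concentration and therefore raises the stringency of that wash at the same temperature.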

Isolation of Lambda Clones Containing the Wild Type
Arabidopsis Delta-15 Desaturase Gene

The 1.4 kb Eco RI-Bam HI fragment (see above) was
used as a probe to screen a λGem-11 library made from
genomic DNA isolated from wild type Arabidopsis thaliana
plants, geographic race WS. To construct the library,
genomic DNA was partially digested with Sau3A enzyme,
and size-fractionated over a salt gradient as described
in Sambrook et al. (Molecular Cloning, A Laboratory
Manual, 2nd ed. (1989) Cold Spring Harbor Laboratory
Press). The size-fractionated DNA was then cloned into
Bam HI-digested λGem-11 phage DNA (Promega) following
the protocol outlined by the manufacturer. About 25,000
plaque-forming units of phage each were plated on five
150 mm petri plates containing a lawn of KW251 cells on
NZY agar media (5 g NaCl, 2 g MgSO4·7H2O, 5 g yeast
extract, 10 g NZ Amine (casein hydrolysate from ICN
Pharmaceuticals), 15 g agar per liter; pH 7.5). The
plaques were adsorbed onto nylon membranes
(Colony/Plaque Screen, New England Nuclear), in
duplicate, and prepared according to the manufacturer's
instructions with the addition of a 2 h incubation at
80°C after air drying the filters. The filters were
prehybridized at 65°C in hybridization buffer (1% BSA,
0.5 M NaPi, pH 7.2 (NaH2PO4 and Na2HPO4), 10 mM EDTA,
and 7% SDS) for 4 h, after which time they were
transferred to fresh buffer containing the denatured
radiolabeled probe (see above) and incubated overnight
at 65°C. The filters were rinsed twice with 0.1X SSC,
1% SDS at 65°C for 30 min each and subjected to
autoradiography on Kodak XA-R film at -80°C overnight.
Seven positively-hybridizing plaques were subjected to
plaque purification as described in Sambrook et al.
(Molecular Cloning, A Laboratory Manual, 2nd ed. (1989),
Cold Spring Harbor Laboratory Press).
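
The number of plaques screened above (five plates of about 25,000 plaque-forming units each) can be compared with the number of clones a genomic library needs in order to contain a given sequence. The sketch below uses the standard Clarke and Carbon estimate, N = ln(1 - P) / ln(1 - f), where f is the ratio of insert size to genome size. The 15 kb insert size is the average reported for the clones in this example; the genome size of 100,000 kb is an assumption made only for illustration and is not taken from the text.

```python
import math

def clones_needed(insert_kb, genome_kb, probability):
    """Clarke-Carbon estimate of the clones needed to include any given
    sequence with the requested probability."""
    f = insert_kb / genome_kb
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - f))

def coverage_probability(insert_kb, genome_kb, n_clones):
    """Probability that a given sequence is present among n_clones."""
    f = insert_kb / genome_kb
    return 1.0 - (1.0 - f) ** n_clones

ASSUMED_GENOME_KB = 100_000      # hypothetical genome size, for illustration
INSERT_KB = 15                   # average insert size reported in the text
PLAQUES_SCREENED = 5 * 25_000    # five plates of about 25,000 pfu each

n_99 = clones_needed(INSERT_KB, ASSUMED_GENOME_KB, 0.99)
p_screen = coverage_probability(INSERT_KB, ASSUMED_GENOME_KB, PLAQUES_SCREENED)
print(f"clones for 99% coverage: {n_99}")
print(f"coverage with {PLAQUES_SCREENED} plaques: {p_screen:.6f}")
```

Under these assumptions the screen covers the genome several times over, which is consistent with recovering several overlapping positive clones.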

Small scale (5 mL) liquid lysates from each of the
7 clones were prepared and titered on KW251 bacteria as
described in Sambrook et al. (Molecular Cloning, A
Laboratory Manual, 2nd ed. (1989), Cold Spring Harbor
Laboratory Press). Phage DNA was isolated using a
variation of the method of Chisholm (Biotechniques
(1989) 7:21-23) in which the initial lysate was made
according to Sambrook et al. (Molecular Cloning, A
Laboratory Manual, 2nd ed. (1989), Cold Spring Harbor
Laboratory Press), the concentration of DNase I and RNase
I (Sigma) was reduced by half, and the PEG precipitation
step was increased to 16 h. Based on restriction
analysis using Hind III, Sal I and Xho I enzymes, the
original 7 positive phage fell into 5 different classes.
While the average insert size was approximately 15 kb,
taken together the clones spanned a 40 kb region of
genomic DNA. Through restriction mapping using 4
different enzymes (Hind III, Bam HI, Kpn I, and Sal I)
singly, and in pair-wise combinations, accompanied by
Southern analysis with the 1.4 kb Eco RI-Bam HI probe
(as above) and other probes obtained from the λ clones
themselves, a partial map was obtained in which all 5
clones (λ1111, λ41A1, λ4211, λ4311 and λ4411) were found
to share an approximately 3 kb region of homology near
the site of T-DNA insertion. Via restriction and
Southern analysis, Applicants ascertained that a 5.2 kb
Hind III fragment present in clones 1111, 41A1, and 4411
also spanned the site of the T-DNA insertion. This
fragment was excised from lambda clone 41A1, inserted
into the Hind III site of the pBluescript vector
(Stratagene), and the resulting plasmid, designated pF1,
was prepared and isolated using standard protocols.
This Hind III fragment was subsequently used to probe an
Arabidopsis cDNA library (see below).

EXAMPLE 2

CLONING OF ARABIDOPSIS THALIANA DELTA-15
DESATURASE cDNA USING GENOMIC DNA FLANKING
THE T-DNA SITE OF INSERTION IN ARABIDOPSIS THALIANA
MUTANT LINE 3707 AS A HYBRIDIZATION PROBE

The 5.2 kb Hind III fragment from plasmid pF1 was
purified by electrophoresis in agarose after digestion
of the plasmid with Hind III and radiolabeled with 32P
as described above. For the preparation of an
Arabidopsis cDNA library, polyadenylated mRNA was
prepared from 3 day-old, etiolated Arabidopsis (ecotype
Columbia) seedling hypocotyls using standard protocols
(Sambrook et al., Molecular Cloning: A Laboratory
Manual, 2nd ed. (1989) Cold Spring Harbor Laboratory
Press). Five micrograms of this mRNA were used as
template with an oligo d(T) primer, and Moloney Murine
Leukemia Virus reverse transcriptase (Pharmacia) was
used to catalyze first strand cDNA synthesis. Second-strand
cDNA was made according to Gubler et al. (Gene
(1983) 25:263-272) except that DNA ligase was omitted.
After the second strand synthesis, the ends of the cDNA
were made blunt by reaction with the Klenow fragment of
DNA polymerase and ligated to Eco RI/Not I adaptors
(Pharmacia). The cDNAs were purified by spun-column
chromatography using Sephacryl S-300 and size-fractionated
on a 1% low melting point agarose gel.
Size-selected cDNAs (1-3 kb) were removed from the gel
using agarase (New England Biolabs) and purified by
phenol:chloroform extraction and ethanol precipitation.
One hundred nanograms of the cDNA was co-precipitated
with 1 µg of λZAP II (Stratagene) Eco RI-digested,
dephosphorylated arms. The DNAs were ligated in a
volume of 4 µL overnight, and the ligation mix was
packaged in vitro using the Gigapack II Gold packaging
extract (Stratagene).

Approximately 80,000 phage were screened for
positively hybridizing plaques using the radiolabeled
5.2 kb Hind III fragment as a probe essentially as
described above and in Sambrook et al. (Molecular
Cloning: A Laboratory Manual, 2nd ed. (1989) Cold
Spring Harbor Laboratory Press). Replica filters of the
phage plaques were soaked in 1 M NaCl, 50 mM Tris-HCl,
pH 7.5, 1% SDS, 5% dextran sulfate, 0.1 mg/mL denatured
salmon sperm DNA during the pre-hybridization step (8 hr
at 65°C) and then probe was added and the hybridization
proceeded over 16 hr at the same temperature. Filters
were washed sequentially with 2X SSPE, 0.1% SDS at room
temperature for 5 min and then again with fresh solution
for 10 min, and finally with 0.5X SSPE, 0.1% SDS at 65°C
for 5 min. Approximately 20 positively hybridizing
plaques were identified in the primary screen. Four of
these were picked and subjected to two further rounds of
screening and purification. From the tertiary screen,
four pure phage plaques were isolated. Plasmid clones
containing the cDNA inserts were obtained through the
use of a helper phage according to the in vivo excision
protocol provided by Stratagene. Double-stranded DNA
was prepared using the alkaline lysis method as
previously described, and the resulting plasmids were
size-analyzed by electrophoresis in agarose gels. The
largest one of these, designated pCF3, contained an
approximately 1.4 kb insert which was sequenced using
Sequenase T7 DNA polymerase (US Biochemical Corp.) and
the manufacturer's instructions, beginning with primers
homologous to vector sequences that flank the cDNA
insert and continuing serially with primers designed
from the newly acquired sequences as the sequencing
experiment progressed. The sequence of this insert is
shown in SEQ ID NO:1.

EXAMPLE 3

CLONING OF AN ARABIDOPSIS cDNA ENCODING A PLASTID
FATTY ACID DESATURASE

A related fatty acid desaturase was cloned in a
similar fashion, except that the probe used was not
derived from a PCR reaction on pCF3, but rather was the
actual 1.4 kb Not I fragment isolated from pCF3 which
was purified and radiolabeled as described above.
Approximately 80,000 phage from the Arabidopsis
etiolated hypocotyl cDNA library described above were
plated out and screened essentially as before, except as
indicated below. The filters were soaked in 1 M NaCl,
50 mM Tris-HCl, pH 7.5, 1% SDS, 5% dextran sulfate,
0.1 mg/mL denatured salmon sperm DNA during the
pre-hybridization step (8 hr at 50°C). Then probe was added
and the hybridization proceeded over 16 hr at the same
temperature. Filters were washed sequentially with 2X
SSPE, 0.1% SDS at room temperature for 5 min and then
again with fresh solution for 10 min, and finally with
0.5X SSPE, 0.1% SDS at 50°C for 5 min. Approximately 17
strongly hybridizing and 17 weakly hybridizing plaques
were identified in the primary screen. Four of the
weakly hybridizing plaques were picked and subjected to
one to two further rounds of screening with the
radiolabeled probe as above until they were pure. To
ensure that these were not delta-15 desaturase clones,
they were further analyzed to determine whether they
hybridized to a delta-15 desaturase 3' end-specific
probe. The probe used was an 18 bp oligonucleotide
which is complementary in sequence (i.e., antisense) to
nucleotides 1229-1246 of SEQ ID NO:1. The probe was
radiolabeled with gamma-32P ATP using T4 polynucleotide
kinase and hybridized to filters containing DNA from the
isolated clones in 6X SSC, 5X Denhardt's, 0.1 mg/mL
denatured salmon sperm DNA, 1 mM EDTA, 1% SDS at 44°C
overnight. The filters were washed twice in 6X SSC,
0.1% SDS for 5 min at room temperature, then in 6X SSC,
0.1% SDS at 44°C for 3-5 min. After autoradiography of
the filters, one of the clones failed to show
hybridization to this probe. This clone was picked, and
a plasmid clone containing the cDNA insert was obtained
through the use of a helper phage according to the in
vivo excision protocol provided by Stratagene.
Double-stranded DNA was prepared using the alkaline lysis
method as previously described, and the resulting
plasmid was size-analyzed by electrophoresis in agarose
gels following either Not I digestion or digestion with
both Nco I and Bgl II. The results were consistent with
the presence in this plasmid, designated pCM2, of an
approximately 1.3 kb cDNA insert which lacked a 0.7 kb
Nco I - Bgl II fragment characteristic of the
Arabidopsis delta-15 desaturase cDNA of pCF3. (This
fragment corresponds to the DNA located between the Nco
I site at nucleotides 474-479 and the Bgl II site at
nucleotides 1164-1169 in SEQ ID NO:1.) The complete
nucleotide sequence of pCM2 is shown in SEQ ID NO:4.
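
The coordinates quoted above are enough to check two of the stated sizes: the 0.7 kb Nco I - Bgl II fragment and the 18-nucleotide antisense probe. The sketch below does that arithmetic and also shows how an antisense oligonucleotide is derived from a sense-strand sequence. It relies on the known cut positions of Nco I (C^CATGG) and Bgl II (A^GATCT), each one nucleotide into its recognition site; the example sequence passed to `reverse_complement` is a placeholder, because SEQ ID NO:1 itself is not reproduced here.

```python
def cut_to_cut_length(first_site_start, second_site_start):
    """Top-strand fragment length between two enzymes that both cut one
    nucleotide into their recognition sites (Nco I C^CATGG, Bgl II A^GATCT)."""
    return second_site_start - first_site_start

def oligo_length(first_nt, last_nt):
    """Length of an oligonucleotide spanning first_nt..last_nt inclusive."""
    return last_nt - first_nt + 1

def reverse_complement(seq):
    """Antisense (reverse-complement) sequence of a DNA string."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]

nco_bgl_fragment = cut_to_cut_length(474, 1164)   # site starts from the text
probe_length = oligo_length(1229, 1246)           # probe span from the text

print(nco_bgl_fragment, "bp Nco I - Bgl II fragment")
print(probe_length, "nt antisense probe")
print(reverse_complement("ATGGCTTCA"))            # placeholder sequence only
```

The computed fragment length agrees with the approximately 0.7 kb fragment described, and the probe span agrees with the stated 18 bp oligonucleotide.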

EXAMPLE 4

CLONING OF PLANT FATTY ACID DESATURASE cDNAs
FROM OTHER SPECIES BY HYBRIDIZATION TECHNIQUES

An approximately 1.4 kb fragment containing the
Arabidopsis delta-15 desaturase coding sequence of SEQ
ID NO:1 was obtained from plasmid pCF3 through the use
of the polymerase chain reaction (PCR). Primers
(M13(-20) and T7-17mer primers, 1991 Stratagene
Catalogue numbers 300303 and 300302, respectively)
flanking the pCF3 insert were used in the PCR which was
carried out essentially as described in the instructions
provided by the vendor in the Perkin-Elmer/Cetus PCR
kit. This fragment was digested with Not I to remove
vector sequences, purified by agarose gel
electrophoresis, and radiolabeled with 32P as previously
described.

EXAMPLE 5

CLONING OF BRASSICA NAPUS SEED cDNAs ENCODING
DELTA-15 FATTY ACID DESATURASES

A cDNA library from developing Brassica napus seeds
was constructed using the polyadenylated mRNA fraction
contained in a polysomal RNA preparation from developing
Brassica napus seeds. Polysomal RNA was isolated
following the procedure of Kamalay et al. (Cell (1980)
19:935-946) from seeds 20-21 days after pollination.
The polyadenylated mRNA fraction was obtained by
affinity chromatography on oligo-dT cellulose (Aviv et
al., Proc. Natl. Acad. Sci. USA (1972) 69:1408-1411).
Four micrograms of polyadenylated mRNA were reverse
transcribed and used to construct a cDNA library in
lambda phage (Uni-ZAP™ XR vector) using the protocol
described in the ZAP-cDNA™ Synthesis Kit (1991
Stratagene Catalog, Item # 200400).

For the purpose of cloning the Brassica napus seed
cDNAs encoding delta-15 fatty acid desaturases, the
Brassica napus seed cDNA library was screened several
times using the inserts from the Arabidopsis cDNAs pCF3
and pCM2 as radiolabelled hybridization probes. One of
the Brassica napus cDNAs obtained in these screens was
used as hybridization probe in a subsequent screen.

For each screening experiment approximately 300,000
phages were screened under low stringency hybridization
conditions. The filter hybridizations were carried out
in 50 mM Tris pH 7.6, 6X SSC, 5X Denhardt's, 0.5% SDS,
100 µg denatured calf thymus DNA at 50°C overnight and
the post hybridization washes were performed in 6X SSC,
0.5% SDS at room temperature for 15 min, then repeated
with 2X SSC, 0.5% SDS at 45°C for 30 min, and then
repeated twice with 0.2X SSC, 0.5% SDS at 50°C for 30
min.

Using the Arabidopsis cDNA insert of pCM2 as a
probe in a low stringency screen, five strongly
hybridizing phages were identified. These phages were
purified and excised according to the protocols
described in the ZAP-cDNA™ Synthesis Kit and pBluescript
II Phagemid Kit (1991 Stratagene Catalog, Item # 200400
and 212205). One of these, designated pBNSF3-f2,
contained a 1.3 kb insert. The pBNSF3-f2 insert was
sequenced completely on both strands. The pBNSF3-f2
nucleotide sequence is shown in SEQ ID NO:6. A
comparison of this sequence with that of the Arabidopsis
thaliana delta-15 desaturase clone (SEQ ID NO:1)
confirmed that pBNSF3-f2 is a Brassica napus cDNA that
encodes a seed microsomal delta-15 desaturase.

An additional low stringency screen of the Brassica
napus seed cDNA library using the cDNA insert in pCM2 as
a probe identified eight strongly-hybridizing phages.
These phages were plaque purified and used to excise the
phagemids as described above. One of these, designated
pBNSFd-8, contained a 0.5 kb insert. pBNSFd-8 was
sequenced completely on one strand; this sequence had
significant divergence from the sequence of pBNSF3-f2.
The cDNA insert in pBNSFd-8 was used as a hybridization
probe in a high stringency screen of the Brassica napus
seed cDNA library. The filter hybridizations were
carried out in 50 mM Tris pH 7.6, 6X SSC, 5X Denhardt's,
0.5% SDS, 100 µg denatured calf thymus DNA overnight at
50°C and post hybridization washes were in 6X SSC, 0.5%
SDS at room temperature for 15 min, then with 2X SSC,
0.5% SDS at 45°C for 30 min, and then twice with 0.2X
SSC, 0.5% SDS at 50°C for 30 min. The high stringency
screen resulted in three strongly hybridizing phages
that were purified and excised as above. One of the
excised plasmids, pBNSFd-3, contained a 1.4 kb insert that
was sequenced completely on both strands. SEQ ID NO:8
shows the nucleotide sequence of pBNSFd-3. A comparison
of this sequence with that of the Arabidopsis thaliana
delta-15 desaturase clone (SEQ ID NO:4) confirmed that
pBNSFd-3 is a Brassica napus cDNA that encodes a seed
plastid delta-15 desaturase.

Cloning of a Soybean Seed cDNA Encoding a
Microsomal Delta-15 Glycerolipid Desaturase

A cDNA library was made as follows: Soybean
embryos (ca. 50 mg fresh weight each) were removed from
the pods and frozen in liquid nitrogen. The frozen
embryos were ground to a fine powder in the presence of
liquid nitrogen and then extracted by Polytron
homogenization and fractionated to enrich for total RNA
by the method of Chirgwin et al. (Biochemistry (1979)
18:5294-5299). The nucleic acid fraction was enriched
for poly A+ RNA by passing total RNA through an oligo-dT
cellulose column and eluting the poly A+ RNA with salt as
described by Goodman et al. (Meth. Enzymol. (1979)
68:75-90). cDNA was synthesized from the purified poly
A+ RNA using cDNA Synthesis System (Bethesda Research
Laboratory) and the manufacturer's instructions. The
resultant double-stranded DNA was methylated by Eco RI
DNA methylase (Promega) prior to filling-in its ends
with T4 DNA polymerase (Bethesda Research Laboratory)
and blunt-end ligation to phosphorylated Eco RI linkers
using T4 DNA ligase (Pharmacia). The double-stranded
DNA was digested with Eco RI enzyme, separated from
excess linkers by passage through a gel filtration
column (Sepharose CL-4B), and ligated to lambda ZAP
vector (Stratagene) according to manufacturer's
instructions. Ligated DNA was packaged into phage using
the Gigapack packaging extract (Stratagene) according to
manufacturer's instructions. The resultant cDNA library
was amplified as per Stratagene's instructions and
stored at -80°C.

Following the instructions in the Lambda ZAP
Cloning Kit Manual (Stratagene), the cDNA phage library
was used to infect E. coli BB4 cells and approximately
80,000 plaque forming units were plated onto 150 mm
diameter petri plates. Duplicate lifts of the plates
were made onto nitrocellulose filters (Schleicher &
Schuell). The filters were prehybridized in 25 mL of
hybridization buffer consisting of 50 mM Tris-HCl, pH
7.5, 1 M NaCl, 1% SDS, 5% dextran sulfate and 0.1 mg/mL
denatured salmon sperm DNA (Sigma Chemical Co.) at 50°C
for 2 h. Radiolabeled probe prepared from pCF3 as
described above was added, and allowed to hybridize for
18 h at 50°C. The filters were washed twice at room
temperature with 2X SSPE, 1% SDS for five minutes
followed by washing for 5 min at 50°C in 0.2X SSPE, 1%
SDS. Autoradiography of the filters indicated that
there was one strongly hybridizing plaque, and
approximately five weakly hybridizing plaques. The more
strongly hybridizing plaque was subjected to a second
round of screening as before, excepting that the final
wash was for 5 min at 60°C in 0.2X SSPE, 1% SDS.
Numerous, strongly hybridizing plaques were observed,
and one, well-isolated from other phage, was picked for
further analysis.

Following the Lambda ZAP Cloning Kit Instruction
Manual (Stratagene), sequences of the pBluescript
vector, including the cDNA inserts, from the purified
phage were excised in the presence of a helper phage and
the resultant phagemid was used to infect E. coli XL-1
Blue cells. DNA from the plasmid, designated pXF1, was
made by the alkaline lysis miniprep procedure described
in Sambrook et al. (Molecular Cloning, A Laboratory
Manual, 2nd ed. (1989) Cold Spring Harbor Laboratory



WO 93/11245 PCT/US92/10284

Press). The alkali-denatured double-stranded DNA from
pXF1 was completely sequenced on both strands. The
insert of pXF1 contained a stretch of 1783 nucleotides
which contained an unknown open-reading frame and also
contained a poly-A stretch of 16 nucleotides 3' to the
open reading frame, from nucleotides 1767 to 1783,
followed by an Eco RI restriction site. The 2184 bases
that followed this Eco RI site contained a 1145 bp open
reading frame which encoded a polypeptide of about 68%
identity to, and colinear with, the Arabidopsis delta-15
desaturase polypeptide listed in SEQ ID NO:2. The
putative start methionine of the 1145 bp open-reading
frame corresponded to the start methionine of the
Arabidopsis microsomal delta-15 peptide and there were
no amino acids corresponding to a plastid transit
peptide 5' to this methionine. When the insert in pXF1
was digested with Eco RI four fragments were observed:
fragments of approximately 370 bp and 1400 bp,
derived from the first 1783 bp of the insert in pXF1,
and fragments of approximately 600 bp and 1600 bp
derived from the other 2184 nucleotides of the
insert in pXF1. Only the 600 bp and 1600 bp fragments
hybridized with probe derived from pCF3 on Southern
blots. It was deduced that pXF1 contained two different
cDNA inserts separated by an Eco RI site and the second
of these inserts was a 2184 bp cDNA encoding a soybean
microsomal delta-15 desaturase. The complete nucleotide
sequence of the 2184 bp soybean microsomal delta-15 cDNA
contained in plasmid pXF1 is listed in SEQ ID NO:10.
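
The deduction that pXF1 carries two separate cDNA inserts rests partly on the Eco RI fragment sizes adding up to the sequenced lengths. The following short sketch checks that bookkeeping using only the sizes quoted above; the 5% tolerance is an arbitrary illustrative allowance for the imprecision of gel-based size estimates, not a value from the text.

```python
def relative_deviation(fragment_sizes_bp, sequenced_length_bp):
    """Fractional difference between summed gel fragment sizes and the
    sequenced length they are supposed to account for."""
    return abs(sum(fragment_sizes_bp) - sequenced_length_bp) / sequenced_length_bp

TOLERANCE = 0.05  # illustrative allowance for gel sizing error (assumption)

inserts = {
    "first insert (1783 nt)": ([370, 1400], 1783),
    "second insert (2184 nt)": ([600, 1600], 2184),
}

deviations = {name: relative_deviation(frags, length)
              for name, (frags, length) in inserts.items()}

for name, dev in deviations.items():
    status = "consistent" if dev <= TOLERANCE else "inconsistent"
    print(f"{name}: {dev:.1%} deviation, {status}")
```

Both pairs of fragments account for their respective sequenced stretches to within about one percent, which supports the two-insert interpretation.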

Cloning of a Soybean Seed cDNA Encoding a Plastid
Delta-15 Glycerolipid Desaturase Using
Soybean Microsomal Delta-15 Desaturase cDNA
as a Hybridization Probe

A 1.0 kb fragment of the coding region of the
soybean microsomal delta-15 desaturase cDNA contained in
plasmid pXF1 was excised by digestion with the
restriction enzyme Hha I. This 1.0 kb fragment was
purified by agarose gel electrophoresis and radiolabeled
with 32P as previously described. The radiolabeled
fragment was used to screen 100,000 plaque-forming units
of the soybean cDNA library as described above.
Autoradiography of the filters indicated that there were
eight hybridizing plaques and these were subjected to a
second round of screening. Sequences of the pBluescript
vector from all eight of the purified phages, including
the cDNA inserts, were excised in the presence of a
helper phage and the resultant phagemids were used to
infect E. coli XL-1 Blue cells. DNA from the plasmids
was made by the alkaline lysis miniprep procedure
described in Sambrook et al. (Molecular Cloning, A
Laboratory Manual, 2nd ed. (1989) Cold Spring Harbor
Laboratory Press). Restriction analysis showed they
contained inserts ranging from 1.0 kb to 3.0 kb in size.
One of these plasmids, designated pSFD-118bwp, contained
an insert of about 1700 bp. The alkali-denatured
double-stranded DNA from pSFD-118bwp was completely
sequenced on both strands. The insert of pSFD-118bwp
contained a stretch of 1675 nucleotides which contained
an open-reading frame encoding a polypeptide of about
80% identity with, and colinear with, the Arabidopsis
plastid delta-15 desaturase polypeptide listed in SEQ ID
NO:5. The open-reading frame also encoded amino acids
corresponding to a plastid transit peptide at the 5' end
of the open-reading frame. The transit peptide was
colinear with, and shared some homology to, the transit
peptide described for the Arabidopsis plastid delta-15
glycerolipid desaturase. Based on the homology to
Arabidopsis plastid delta-15 glycerolipid desaturase and
because of the presence of a plastid transit peptide,
the cDNA contained in plasmid pSFD-118bwp was deduced to
be a soybean plastid delta-15 glycerolipid desaturase.
The complete nucleotide sequence of the 1675 bp soybean
plastid delta-15 glycerolipid desaturase cDNA is listed
in SEQ ID NO:12.
EXAMPLE 6

CLONING OF cDNA SEQUENCES ENCODING FATTY ACID
DESATURASES BY POLYMERASE CHAIN REACTION

Analysis of the deduced protein sequences of the
different higher plant glycerolipid desaturases
described in this invention reveals to those skilled in
the art regions of the amino acid sequences that have
been conserved among higher plants and between higher
plants and cyanobacterial des A. These short stretches
of amino acids can be used to design oligomers as
primers for polymerase chain reactions. Two amino acid
sequences that are highly conserved between the des A
and plant delta-15 desaturase polypeptides are amino
acid sequences 97-108 and 299-311 (SEQ ID NO:2).
Polymerase chain reactions (PCRs) were performed using
GeneAmp® RNA PCR Kit (Perkin Elmer Cetus) following
manufacturer's protocols. In one PCR experiment, SEQ ID
NOS:22 and 23 were used as sense primers and either SEQ
ID NOS:24 and 25 or SEQ ID NOS:26 and 27 as antisense
primers on poly A+ RNA purified from both Arabidopsis
leaf and canola developing seeds. For this, ca. 100 ng
of poly A+ RNA was isolated as described previously and
reverse-transcribed using the kit using random hexamers.
Then the cDNA was used in PCR using 64 pmoles each of
SEQ ID NOS:22 and 23 as sense primers and either a
mixture of 64 pmoles of SEQ ID NO:24 and 78 pmoles of
SEQ ID NO:25 or a mixture of 35 pmoles of SEQ ID NO:26 and
50 pmoles of SEQ ID NO:27 by the following program: a)
1 cycle of 2 min at 95°C and 15 sec at 50°C, b) 30 cycles
of 3 min at 65°C (extension), 1 min 20 sec at 95°C
(denaturation), 2 min at 50°C (annealing), and c) 1
cycle of 7 min at 65°C. PCR products were analyzed by
gel electrophoresis. All PCRs resulted in PCR products
of the correct size (ca. 630 bp). The PCR products from
Arabidopsis and canola were purified and used as
radiolabeled hybridization probes to screen the Lambda
Yes Arabidopsis cDNA library at low stringency, as
described above. This led to the isolation of a pure
phage, which was excised to give plasmid pYacp7. The
cDNA insert in pYacp7 was partially sequenced. Its
sequence showed that it encoded an incomplete desaturase
polypeptide that was identical to another cDNA (in
plasmid pFadx-2) isolated by low-stringency
hybridization as described previously. The composite
sequence derived from the partial sequences from the
cDNA inserts in pFadx-2 and pYacp7 is shown in SEQ ID
NO:16 and the polypeptide encoded by it in SEQ ID NO:17.
As discussed previously, SEQ ID NO:17 is a putative
plastid delta-15 desaturase. A full-length version of
pYacp7 can be readily isolated using it as a
hybridization probe.

Two additional conserved regions correspond to
amino acid residues 130 to 137 and 249 to 256 of SEQ ID
NO:7 (Brassica napus glycerolipid desaturase delta-15).
Degenerate oligomers were designed to these regions, and
additional nucleotides containing a restriction site for
Bam HI were added to the 5' ends of each oligonucleotide
to facilitate subcloning of the PCR products. The
nucleotide sequences of these oligonucleotides, named
F2-3 and F2-3c, are shown in SEQ ID NO:18 and SEQ ID
NO:19 respectively.
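
Designing a degenerate oligomer against a conserved peptide amounts to back-translating each residue into every codon that can encode it. The sketch below illustrates the idea with the standard genetic code and IUPAC ambiguity codes. The peptide used is a made-up placeholder and not one of the conserved regions of SEQ ID NO:7, and the code is not a reconstruction of how F2-3 and F2-3c were actually designed. The Bam HI recognition sequence GGATCC is prepended as a 5' tail, as described above; the function names are illustrative.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {}
for codon, aa in zip(("".join(c) for c in product(BASES, repeat=3)), AMINO):
    CODONS.setdefault(aa, []).append(codon)

IUPAC = {frozenset(k): v for k, v in {
    "A": "A", "C": "C", "G": "G", "T": "T", "AG": "R", "CT": "Y", "CG": "S",
    "AT": "W", "GT": "K", "AC": "M", "CGT": "B", "AGT": "D", "ACT": "H",
    "ACG": "V", "ACGT": "N"}.items()}

def degenerate_codon(aa):
    """Per-position IUPAC consensus of all codons for one amino acid.
    For six-codon residues (L, S, R) this covers some non-coding codons too."""
    return "".join(IUPAC[frozenset(c[i] for c in CODONS[aa])] for i in range(3))

def back_translate(peptide, tail=""):
    """Return (degenerate oligo, number of distinct coding sequences)."""
    oligo = tail + "".join(degenerate_codon(aa) for aa in peptide)
    n_coding = 1
    for aa in peptide:
        n_coding *= len(CODONS[aa])
    return oligo, n_coding

BAM_HI_SITE = "GGATCC"
oligo, n_coding = back_translate("FKDMW", tail=BAM_HI_SITE)  # placeholder peptide
print(oligo, n_coding)
```

Peptides rich in Met, Trp and two-codon residues keep the number of coding sequences low, which is one reason such stretches are favored when choosing regions for degenerate primers.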

Mixtures of degenerate oligonucleotides F2-3 and
F2-3c were used to amplify, isolate and clone
glycerolipid desaturase sequences represented in corn seed mRNA
population, essentially as described in the GeneAmp RNA
PCR Kit purchased from Perkin Elmer Cetus and in Innis
et al., Eds. (1990) PCR Protocols: A Guide to Methods
and Applications, Academic Press, San Diego.

Corn seed RNA was obtained from developing corn
seeds 15-20 days after pollination by the method of
Chirgwin et al. (1979) Biochemistry 18:5294. Corn seed
polyadenylated mRNA was isolated by affinity
chromatography on oligo-dT cellulose (Aviv et al., Proc.
Natl. Acad. Sci. USA (1972) 69:1408-1411). 20-50 ng of
A+ mRNA were used in reverse transcription reactions with
oligo-dT and random hexamer primers using the reaction
buffer and conditions recommended by Perkin Elmer Cetus.
The resulting cDNA was then used as template for the
amplification of corn seed glycerolipid desaturase
sequences using the set of degenerate primers in SEQ ID
NO:18 and 19. Reaction conditions were as described by
Perkin Elmer Cetus; the amplification protocol consisted
of a sequence of 95°C/1 min, 55°C/1 min, 72°C/2 min for
30-50 cycles. The resulting polymerase reaction products were
phenol-chloroform extracted, digested with Bam HI and
separated from unincorporated primers by gel filtration
chromatography on Linker 6 spin columns (Pharmacia
Inc.). The resulting PCR products were cloned into
pBluescript SK at the Bam HI site, and transformed into
E. coli DH5 competent cells. Restriction analysis of
plasmid DNA from the transformed colonies obtained
revealed a colony, PCR-20, that contained an insert of
about 0.5 kb in size at the pBluescript SK Bam HI site.
The PCR-20 insert was completely sequenced on both
strands. The nucleotide sequence of the PCR20 insert is
shown in SEQ ID NO:14 and the translated amino acid
sequence is shown in SEQ ID NO:15. This amino acid
sequence shows an overall identity of 61.9% to the
amino acid sequence of Brassica napus microsomal delta-15
desaturase shown in SEQ ID NO:7. This result identifies
the PCR20 insert as a polymerase reaction product of a
corn seed delta-15 desaturase cDNA. The PCR20 insert may be
used as a probe to readily isolate full length corn seed
delta-15 desaturase cDNAs or as such to antisense or
cosuppress corn seed glycerolipid delta-15 desaturase
gene expression in transgenic corn plants by cloning it
in the appropriate corn gene expression vector.
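
An overall identity figure such as the 61.9% quoted above is computed from a pairwise alignment of the two amino acid sequences. The sketch below shows one common convention: matches divided by the number of aligned columns in which neither sequence has a gap. The text does not say which alignment method or denominator was used, so this convention is an assumption, and the two aligned strings are invented placeholders rather than SEQ ID NO:15 and SEQ ID NO:7.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap.

    Both inputs must already be aligned and therefore equal in length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = [(a, b) for a, b in zip(aligned_a, aligned_b)
                if a != gap and b != gap]
    if not compared:
        raise ValueError("no ungapped columns to compare")
    matches = sum(a == b for a, b in compared)
    return 100.0 * matches / len(compared)

# Placeholder alignment, for illustration only.
corn_fragment = "MKT-AHECGHW"
brassica_ref  = "MRTLAHDCGHW"
print(f"{percent_identity(corn_fragment, brassica_ref):.1f}% identity")
```

Other conventions divide by the length of the shorter sequence or by the full alignment length, so identity values are only comparable when the same convention is used throughout.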

USE OF THE ARABIDOPSIS THALIANA DELTA-15 DESATURASE
GENOMIC CLONES AS RESTRICTION FRAGMENT LENGTH
POLYMORPHISM (RFLP) MARKERS TO MAP THE DELTA-15
DESATURASE LOCI IN ARABIDOPSIS

DNA flanking the T-DNA insertion site in mutant 
line 3707 was used to map the genetic locus encoding the 
delta-15 desaturase of ftrfitricloffsis tha l i ana seeds . An 

15 approximately 12 kB genomic DNA fragment containing the 
a^aHSrinpsis delta-15 desaturase coding sequence was 
removed from the lambda-4211 clone by digestion with
restriction endonuclease Xho I, separated from the
lambda arms by agarose gel electrophoresis, and purified
using standard procedures. The isolated DNA was labeled
with 32P using a random priming kit from Pharmacia under
conditions recommended by the manufacturer. The
radioactive DNA was used to probe a Southern blot
containing genomic DNA from Arabidopsis thaliana
(ecotype Wassilewskija and marker line W100, ecotype
Landsberg background) digested with one of several
restriction endonucleases. Following hybridization and
washes under standard conditions (Sambrook et al.,
Molecular Cloning: A Laboratory Manual, 2nd ed. (1989)
Cold Spring Harbor Laboratory Press), autoradiograms
were obtained. Different patterns of hybridization
(polymorphisms) were identified in digests using
restriction endonucleases Bgl II, Cla I, Hind III, Nsi
I, and Xba I. The same radiolabeled DNA fragment was
used to map the polymorphism essentially as described by
Helentjaris et al. (Theor. Appl. Genet. (1986)
72:761-769). The radiolabeled DNA fragment was applied
as described above to Southern blots of Xba I digested
genomic DNA isolated from 117 recombinant inbred progeny
(derived from single-seed descent lines to the F6
generation) resulting from a cross between Arabidopsis
thaliana marker line W100 and ecotype Wassilewskija (Burr
et al., Genetics (1988) 118:519-526). The bands on the
autoradiograms were interpreted as resulting from
inheritance of either paternal (ecotype Wassilewskija) or
maternal (marker line W100) DNA or both (a
heterozygote). The resulting segregation data were
subjected to genetic analysis using the computer program
Mapmaker (Lander et al., Genomics (1987) 1:174-181). In
conjunction with previously obtained segregation data
for 63 anonymous RFLP markers and 9 morphological
markers in Arabidopsis thaliana (Chang et al., Proc.
Natl. Acad. Sci. USA (1988) 85:6856-6860; Nam et al.,
Plant Cell (1989) 1:699-705), a single genetic locus was
positioned corresponding to the genomic DNA containing
the delta-15 desaturase coding sequence. The location
of the delta-15 desaturase gene was thus determined to
be on chromosome 2 between the lambda AT283 and cosmid
c6842 RFLP markers, near the and erecta morphological
markers.
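
The two-point arithmetic behind this kind of linkage
mapping can be sketched as follows. This is only an
illustration and not the Mapmaker algorithm, which
performs multipoint maximum-likelihood analysis. The
genotype coding ("A" and "B" for the two parental
alleles, "H" for a heterozygote), the function names,
and the example scores are all hypothetical. The sketch
assumes recombinant inbred lines produced by selfing,
for which the observed recombinant fraction R relates
to the per-meiosis recombination fraction r by
R = 2r/(1 + 2r), and it uses the Haldane mapping
function to express r in centimorgans.

```python
import math


def ril_recombinant_fraction(marker_a, marker_b):
    """Fraction of lines carrying different parental alleles at two markers.

    Lines scored as anything other than "A" or "B" at either marker
    (for example "H" for a heterozygote) are skipped.
    """
    pairs = [(a, b) for a, b in zip(marker_a, marker_b)
             if a in "AB" and b in "AB"]
    if not pairs:
        raise ValueError("no line is scoreable at both markers")
    return sum(a != b for a, b in pairs) / len(pairs)


def per_meiosis_r(ril_fraction):
    """Invert R = 2r / (1 + 2r), the relation for RILs made by selfing."""
    return ril_fraction / (2.0 * (1.0 - ril_fraction))


def haldane_cm(r):
    """Haldane map distance in centimorgans for recombination fraction r."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must satisfy 0 <= r < 0.5")
    return -50.0 * math.log(1.0 - 2.0 * r)


# Hypothetical scores for ten lines at a probe and a reference marker.
probe = "AAAABBBBAA"
reference = "AAABBBBBAH"

R = ril_recombinant_fraction(probe, reference)
r = per_meiosis_r(R)
print(f"R = {R:.3f}, r = {r:.4f}, distance = {haldane_cm(r):.1f} cM")
```

A locus is placed between the two reference markers to
which it shows the smallest distances; a real analysis
would use all lines and all markers simultaneously.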

The cDNA in plasmid pCM2 was also shown to
hybridize polymorphically to genomic DNA from
Arabidopsis thaliana (ecotype Wassilewskija and marker
line W100, ecotype Landsberg background) digested with
Eco RI. It was used as an RFLP marker to map the genetic
locus for the gene encoding this fatty acid desaturase
in Arabidopsis as described above. A single genetic
locus was positioned corresponding to this desaturase
cDNA. Its location was thus determined to be on
chromosome 3 between the lambda AT228 and cosmid c3838
RFLP markers, "north" of the glabrous locus (Chang et
al., Proc. Natl. Acad. Sci. USA (1988) 85:6856-6860; Nam
et al., Plant Cell (1989) 1:699-705).

USE OF SOYBEAN SEED MICROSOMAL DELTA-15 GLYCEROLIPID
DESATURASE cDNA SEQUENCE IN PLASMID AS A RESTRICTION
FRAGMENT LENGTH POLYMORPHISM (RFLP) MARKER
A 600 bp fragment of the cDNA insert from plasmid
pXF1, which contains about 300 bp of the coding sequence
and 300 bp of the 3' untranslated sequence, was excised
by digestion with restriction enzyme Eco RI in standard
conditions as described in Sambrook et al. (Molecular
Cloning, A Laboratory Manual, 2nd ed. (1989) Cold Spring
Harbor Laboratory Press), purified by agarose gel
electrophoresis and labeled with 32P using a Random
Priming Kit from Bethesda Research Laboratories under
conditions recommended by the manufacturer. The
resulting radioactive probe was used to probe a Southern
blot (Sambrook et al., Molecular Cloning, A Laboratory
Manual, 2nd ed. (1989) Cold Spring Harbor Laboratory
Press) containing genomic DNA from soybean [Glycine max
(cultivar Bonus) and Glycine soja (PI81762)], digested
with one of several restriction enzymes. After
hybridization and washes under standard conditions
(Sambrook et al., Molecular Cloning, A Laboratory Manual,
2nd ed. (1989), Cold Spring Harbor Laboratory Press),
autoradiograms were obtained and different patterns of
hybridization (polymorphisms) were identified in digests
performed with restriction enzymes Bam HI, Eco RV and
Eco RI. The same probe was then used to map the
polymorphic pXF1 locus on the soybean genome,
essentially as described by Helentjaris et al. (Theor.
Appl. Genet. (1986) 72:761-769). Plasmid pXF1/600 bp
probe was applied, as described above, to Southern blots
of Eco RI, Pst I, Eco RV, Bam HI, or Hind III digested
genomic DNAs isolated from 68 F2 progeny plants
resulting from a G. max Bonus x G. soja PI81762 cross.
The bands on the autoradiograms were interpreted as
resulting from the inheritance of either paternal
(Bonus) or maternal (PI81762) pattern, or both (a
heterozygote). The resulting data were subjected to
genetic analysis using the computer program Mapmaker
(Lander et al., Genomics (1987) 1:174-181). In
conjunction with previously obtained data for 436
anonymous RFLP markers in soybean (Tingey et al.,
J. Cell. Biochem., Supplement 14E (1990) p. 291,
abstract R153), Applicants were able to position a
single genetic locus corresponding to the pXF1/600 bp
probe on the soybean genetic map. This confirms that
the gene for microsomal delta-15 desaturase is located
on chromosome 19 in the soybean genome. This
information will be useful in soybean breeding targeted
towards developing lines with altered polyunsaturate
levels.

EXAMPLE 9

OVEREXPRESSION OF MICROSOMAL DELTA-15
FATTY ACID DESATURASE IN PLANTS

Detailed procedures for DNA manipulation, such as
use of restriction endonucleases and other DNA modifying
enzymes, agarose gel electrophoresis, isolation of DNA
from agarose gels, transformation of E. coli cells with
plasmid DNA, and isolation and sequencing of plasmid DNA
are described in Sambrook et al. (1989) Molecular
Cloning, A Laboratory Manual, 2nd ed. Cold Spring Harbor
Laboratory Press and Ausubel et al. (1989) Current
Protocols in Molecular Biology, John Wiley & Sons. All
restriction enzymes and modifying enzymes were obtained
from Bethesda Research Laboratory, unless otherwise
noted.

To test the biological effect of overexpression of
the microsomal delta-15 desaturase, SEQ ID NO:1, i.e.,
the cDNA encoding Arabidopsis thaliana microsomal delta-
15 desaturase, was placed in the sense orientation
behind either the CaMV 35S promoter, to provide
constitutive expression, or behind the promoter for the
gene encoding soybean α' subunit of the β-conglycinin
(7S) seed storage protein, to provide embryo-specific
expression. To create the chimeric gene constructs,
specific expression cassettes were made to facilitate
easy manipulation of the desired clones. The chimeric
genes were then transformed into plant cells by
Agrobacterium tumefaciens's binary Ti plasmid vector
system [Hoekema et al., (1983) Nature 303:179-180; Bevan
(1984) Nucl. Acids Res. 12:8711-8720].

Overexpression of Arabidopsis Delta-15 Fatty Acid
Desaturase in Carrot Hairy Roots
To confirm the identity of SEQ ID NO:1 (Arabidopsis
microsomal delta-15 fatty acid desaturase) and to test
the biological effect of its overexpression in a
heterologous plant species, the constitutive chimeric
gene 35S:SEQ ID NO:1 was introduced into carrot tissue
by Agrobacterium. The cassette for constitutive gene
expression in plasmid pAW28 originated from pK35K
which, in turn, is derived from pKNK. Plasmid pKNK is a
pBR322-based vector containing a chimeric gene for plant
kanamycin resistance: nopaline synthase (NOS)
promoter/neomycin phosphotransferase (NPT) II coding
region/3' NOS chimeric gene. Plasmid pKNK has been
deposited on 7 January 1987 with the American Type
Culture Collection of Rockville, Maryland, USA under the
provisions of the Budapest Treaty and bears the deposit
accession number 67284. A map of this plasmid is shown
in Lin et al., Plant Physiol. (1987) 84:856-861. The
NOS promoter region is a 296 bp Sau 3A-Pst I fragment
corresponding to nucleotides -263 to +33, with respect
to the transcription start site, of the NOS gene
described by Depicker et al. (1982) J. Appl. Genet.
1:561-574. The Pst I site at the 3' end was created at
the translation initiation codon of the NOS gene. The
NptII coding region is a 998 bp Hind III-Bam HI fragment
obtained from transposon Tn5 (Beck et al., Gene (1982)
19:327-336) by the creation of Hind III and Bam HI sites
at nucleotides 1540 and 2518, respectively. The 3' NOS
is a 702 bp Bam HI-Cla I fragment from nucleotides 848
to 1550 of the 3' end of the NOS gene (Depicker et al.,
J. Appl. Genet. (1982) 1:561-574) including its
polyadenylation region. pKNK was converted to pK35K by
replacing its Eco RI-Hind III fragment containing the
NOS promoter with an Eco RI-Hind III fragment containing
the CaMV 35S promoter. The Eco RI-Hind III 35S promoter
fragment is the same as that contained in pUC35K that
has been deposited on 7 January 1987 with the American
Type Culture Collection under the provisions of the
Budapest Treaty and bears the deposit accession number
67285. The 35S promoter fragment was prepared as
follows, and as described in Odell et al., Nature (1985)
313:810-813, except that the 3' end of the fragment
includes CaMV sequences to +21 with respect to the
transcription start site. A 1.15 kB Bgl II segment of
the CaMV genome containing the region between -941 and
+208 relative to the 35S transcription start site was
cloned in the Bam HI site of the plasmid pUC13. This
plasmid was linearized at the Sal I site in the
polylinker located 3' to the CaMV fragment and the 3'
end of the fragment was shortened by digestion with
nuclease Bal31. Following the addition of Hind III
linkers, the plasmid DNA was recircularized. From
nucleotide sequence analysis of the isolated clones, a
3' deletion fragment was selected with the Hind III



WO 93/11245                                PCT/US92/10284



linker positioned at +21. The 35S promoter fragment was 
isolated as an Eco RI-Hind III fragment, the Eco RI site 
coming from the polylinker of pUC13. 
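
The fragment sizes quoted above can be checked against
their stated coordinates with a few lines of Python.
This is a minimal sketch under two assumptions that are
not spelled out in the text: transcript-relative
coordinates have no position 0 (so -263 to +33 spans
263 + 33 bp), and sizes between two engineered
restriction sites are taken as the simple difference of
the site coordinates. The function names are
hypothetical.

```python
def relative_span_bp(upstream, downstream):
    """Size of a fragment given in transcript-relative coordinates.

    `upstream` is negative (before the start site) and `downstream`
    is positive; there is assumed to be no position 0.
    """
    if upstream >= 0 or downstream <= 0:
        raise ValueError("expected upstream < 0 < downstream")
    return -upstream + downstream


def site_to_site_bp(first_site, second_site):
    """Size of a fragment bounded by two sites in absolute coordinates."""
    if second_site <= first_site:
        raise ValueError("second site must lie downstream of the first")
    return second_site - first_site


print(relative_span_bp(-263, 33))    # NOS promoter fragment
print(relative_span_bp(-941, 208))   # CaMV Bgl II segment
print(site_to_site_bp(848, 1550))    # 3' NOS fragment
```

Under these conventions the NOS promoter and 3' NOS
coordinates reproduce the quoted 296 bp and 702 bp, and
the CaMV segment comes to 1149 bp, in line with the
quoted 1.15 kB.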

The NPTII coding region was removed from plasmid
pK35K by digestion with Hind III and Bam HI restriction
enzymes. Following digestion, the ends of the DNA
molecules were filled in using Klenow enzyme. Not I
linkers (New England Biolabs) were then ligated on the
ends and the plasmid was recircularized to yield plasmid
pK35Nt. A 1.7 kB fragment containing the 35S promoter
region - Not I site - 3' untranslated region from
nopaline synthase was liberated from pK35Nt using
restriction endonucleases Eco RI and Cla I. Following
restriction digestion the ends of the DNA molecules were
filled in using Klenow enzyme, after which Xho I linkers
(New England Biolabs) were added. The 1.7 kB fragment,
now containing Xho I sites at either end, was gel
isolated and cloned into the plasmid vector pURA3
(Clontech) at its unique Xho I site. The vector pURA3
was chosen due to the absence of a Not I restriction
site, the presence of a single Xho I restriction site
and because the relatively large size of the vector
(pURA3) would make the isolation of the gene expression
cassettes relatively easy from the final construct.

The 1.4 kB Not I fragment in plasmid pCF3
containing Arabidopsis microsomal delta-15 desaturase
(SEQ ID NO:1) was isolated and ligated to pAW28 (the
constitutive expression cassette) previously linearized
with Not I restriction enzyme and treated with calf
intestinal alkaline phosphatase (Boehringer Mannheim) to
result in plasmids pAW29 and pAW30 that had SEQ ID NO:1
cloned in a sense orientation and antisense orientation,
respectively, with respect to the promoter. The
orientation of the cDNA relative to the promoters was
established by digestion with appropriate restriction
endonucleases or by sequencing across the promoter-cDNA
junctions.

The chimeric genes 35S promoter/sense SEQ ID
NO:1/3' NOS and 35S promoter/antisense SEQ ID NO:1/3' NOS
were isolated as a 3 kB Xho I fragment from plasmids
pAW29 and pAW30, respectively, and cloned into the
binary vector pZS194b at its unique Sal I site to result
in plasmids pAW31 and pAW32, respectively. The
orientation of the plant selectable marker gene in pAW31
and pAW32 is the same as that of the 35S promoter as
ascertained by digestion with appropriate restriction
endonucleases. Binary vector pZS194b contains the
pBR322 origin of replication, the replication and
stability regions of the Pseudomonas aeruginosa plasmid
pVS1 [Itoh et al., (1984) Plasmid 11:206-220] required
for replication and maintenance of the plasmid in
Agrobacterium, the bacterial NPT II gene (kanamycin
resistance) from Tn5 [Berg et al., (1975) Proc. Nat'l.
Acad. Sci. U.S.A. 72:3628-3632] as a selectable marker
for transformed bacteria, left and right borders of the
T-DNA of the Ti plasmid [Bevan et al., (1984) Nucl.
Acids Res. 12:8711-8720], and, between the left and
right T-DNA borders, the chimeric NOS:NPT II gene for
plant kanamycin resistance, described above, as a
selectable marker for transformed plant cells and the E.
coli lacZ α-complementing segment [Vieira and Messing
(1982) Gene 19:259-267] with unique restriction
endonuclease sites for Kpn I and Sal I.

The binary vectors pAW31 and pAW32 were transformed
by the freeze/thaw method [Holsters et al. (1978) Mol.
Gen. Genet. 163:181-187] into Agrobacterium tumefaciens
strain R1000, carrying the Ri plasmid pRiA4b from
Agrobacterium rhizogenes [Moore et al., (1979) Plasmid
2:617-626] to result in transformants R1000/pAW31 and
R1000/pAW32, respectively.

Carrot (Daucus carota L.) cells were transformed by
co-cultivation of carrot root disks with strain R1000,
R1000/pAW31, or R1000/pAW32 by the method of Petit et
al. [(1986) Mol. Gen. Genet. 202:388-393]. To prepare
explants for inoculation, carrots purchased from the
local supermarket were first scrubbed gently with water
and dish detergent, then rinsed thoroughly with tap and
distilled water. They were surface sterilized in a
stirred solution of 50% Clorox and distilled water for
30 min and rinsed thoroughly with sterile distilled
water. The carrots were peeled using an autoclaved
vegetable peeler and then sliced with a scalpel blade
into disks of approximately 5-10 mm thickness. The
disks were placed in petri dishes, onto a medium
consisting of distilled deionized water solidified with
0.7% agar, in an inverted orientation so that the cut
surface nearest to the root apex of the carrot was
exposed for inoculation.

Cultures of Agrobacterium strains R1000,
R1000/pAW31, and R1000/pAW32 were initiated from freshly
grown plates in LB broth plus the appropriate antibiotic
selective agents (50 mg/L chloramphenicol for R1000,
or 50 mg/L each of chloramphenicol and kanamycin for
R1000/pAW31 and R1000/pAW32) and grown at 28°C to an
optical density of around 1.0 at 600 nm. Bacterial
cells were pelleted by centrifugation, rinsed and
resuspended in LB broth without antibiotics. Freshly
cut carrot disks were inoculated by applying 100 µL of
the bacterial suspension to the cut surface of each
disk. As a control, some disks were inoculated with
sterile LB broth only, to indicate the extent of root
formation in the absence of Agrobacterium.

Inoculated root disks were incubated at 25°C in the
dark in petri dishes sealed with Parafilm. After two
weeks of co-cultivation of carrot disks with
Agrobacterium, the carrot disks were transferred to
fresh agar-solidified water medium containing 500 mg/L
carbenicillin for the counterselection of Agrobacterium.
At this time, hairy root formation was noted on some
root disks. Transfer of the explants to fresh
counterselection medium was done at four weeks.
Excision of individual roots from the explants was begun
at six weeks. Ten days later, additional roots were
taken from the explants as needed.

Approximately 5-10 mm long hairy roots were excised
and individually subcultured on MS minimal organics
medium with 30 g/L sucrose (Gibco, Grand Island, N.Y.,
Cat. No. 510-1118EA) and 500 mg/L carbenicillin.
Approximately equal numbers of roots were subcultured in
liquid medium and in a medium solidified with 0.6%
agarose. Cultures on solid medium were grown in 60 x
100 mm petri dishes; liquid cultures were in 6-well
culture dishes. When excising roots, an effort was made
to select single roots from distinct callus-like
outgrowths on the wounded surface. These sites of
excision were marked on the lid of the petri dish to
minimize repeat sampling of tissue originating from the
same transformation event.

Two to three weeks after excision from the
explants, individual hairy root cultures that were not
visibly contaminated with Agrobacterium were transferred
to fresh MS medium supplemented with 500 mg/L
carbenicillin. The root mass of each culture was cut
into segments including one or more branch roots, and
these segments were transferred as a group to a plate or
well of fresh medium. Approximately 20 mg fresh weight
of tissue of root cultures which grew to adequate size
within the next two to three weeks were sampled for
fatty acid composition by gas chromatography of the
fatty acyl methyl esters essentially as described by
Browse et al. (Anal. Biochem. (1986) 152:141-145)
except that 2.5% H2SO4 in methanol was used as the
methylation reagent and samples were heated for 1.5 h at
80°C to effect the methanolysis of the seed
triglycerides. The results are shown in Table 6. A
second sample of tissue consisting of an actively
growing root tip of approximately 1 cm was excised and
placed on MS medium supplemented with 500 mg/L
carbenicillin and 25-50 mg/L kanamycin to test for
kanamycin resistance and select for hairy roots co-
transformed with the binary vector [Simpson et al.

(1986) Plant Mol. Biol. 6:403-415].

                      TABLE 6
        Percent 18:3 and 18:2/18:3 Ratio in
            Roots of Transgenic Carrots

Root      Transformation
Sample    Vector Used       % 18:3    18:2/18:3

  1       R1000/pAW31         62        0.09
  2       R1000/pAW31          8        7.30
  3       R1000/pAW31         10        5.69
  4       R1000/pAW31         62        0.06
  5       R1000/pAW31         10        5.07
  6       R1000/pAW31          4       14.2
  7       R1000/pAW31         61        0.18
  8       R1000/pAW31          4       15.1
  9       R1000/pAW31         61        0.07
 10       R1000/pAW31         63        0.09
 11       R1000/pAW31         15        3.04
 12       R1000/pAW31
 13       R1000/pAW31          5        9.94
 14       R1000/pAW31          9        6.72
 15       R1000/pAW31          8        7.08
 16       R1000/pAW31          8        6.31
 17       R1000/pAW31         23        1.86
 18       R1000/pAW31          8        7.33
 19       R1000/pAW31         10        5.99
 20       R1000/pAW31          7        8.83
 21       R1000/pAW32          9        6.80
 22       R1000/pAW32          4       11.8
 23       R1000/pAW32          3       18.8
 24       R1000/pAW32         10        6.21
 25       R1000/pAW32          7        8.57
 26       R1000/pAW32          3       16.4
 27       R1000/pAW32          6        8.29
 28       R1000/pAW32          5        9.19
 29       R1000/pAW32          5        8.47
 30       R1000/pAW32          8        7.17
 31       R1000/pAW32          4       11.9
 32       R1000/pAW32          8        7.20
 33       R1000/pAW32          5       10.4
 34       R1000/pAW32          8        7.29
 35       R1000/pAW32          3       17.2
 36       R1000/pAW32          8        7.27
 37       R1000/pAW32          9        6.01
 38       R1000/pAW32          9        6.62
 40       R1000/pAW32          9        6.02
 41       R1000                8        7.23
 42       R1000                8        7.83
 43       R1000               10        6.20
 44       R1000                9        5.97
 45       R1000                9        6.73
 46       R1000                9        6.27
 47       R1000                8        7.27
 48       R1000                7        8.30
 49       R1000                9        7.11

The ability of R1000 transformed "hairy" roots to
grow in the absence of exogenous phytohormones can be
attributed to the Ri plasmid, pRiA4b. When R1000/pAW31
or R1000/pAW32 strains are used to transform, only a
fraction (about half) of the "hairy" roots will also be
transformed with the experimental binary vector, pAW31
or pAW32. Thus, as expected, not all hairy roots
resulting from transformation with R1000/pAW31 show the
high 18:3 phenotype. The absence of any significant
fatty acid phenotype in "hairy roots" transformed with
R1000/pAW32 is expected, since carrot and Arabidopsis
delta-15 desaturase sequences are not expected to be
sufficiently related. These results show that
overexpression of Arabidopsis microsomal delta-15
desaturase can result in over 10-fold increase in 18:3
at the expense of 18:2 in heterologous plant tissue.
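
As a rough illustration of how the roots in Table 6
separate into two classes, the sketch below counts
roots whose 18:2/18:3 ratio falls below a cutoff. The
ratios are transcribed from the legible rows of Table 6
(sample 12 is omitted because its values cannot be
read); the cutoff of 1.0 and all names in the code are
assumptions chosen for illustration, not criteria given
in the text.

```python
# 18:2/18:3 ratios for roots from R1000/pAW31 (sample 12 omitted: illegible).
PAW31_RATIOS = [0.09, 7.30, 5.69, 0.06, 5.07, 14.2, 0.18, 15.1, 0.07, 0.09,
                3.04, 9.94, 6.72, 7.08, 6.31, 1.86, 7.33, 5.99, 8.83]

# 18:2/18:3 ratios for control roots from R1000 alone (samples 41-49).
R1000_RATIOS = [7.23, 7.83, 6.20, 5.97, 6.73, 6.27, 7.27, 8.30, 7.11]

# Hypothetical cutoff: a ratio below 1 means 18:3 exceeds 18:2.
HIGH_18_3_CUTOFF = 1.0


def count_high_18_3(ratios, cutoff=HIGH_18_3_CUTOFF):
    """Count roots whose 18:2/18:3 ratio is below the cutoff."""
    return sum(1 for ratio in ratios if ratio < cutoff)


for label, ratios in (("R1000/pAW31", PAW31_RATIOS),
                      ("R1000", R1000_RATIOS)):
    n_high = count_high_18_3(ratios)
    print(f"{label}: {n_high} of {len(ratios)} roots below the cutoff")
```

With this cutoff the high 18:3 class appears only among
roots from R1000/pAW31, and only in a minority of them,
which is in keeping with incomplete co-transformation by
the binary vector.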
Overexpression of Arabidopsis Delta-15 Fatty Acid
Desaturase in Seeds and Complementation of the
Mutation in Delta-15 Desaturation in Mutant 3707

To complement the delta-15 desaturation mutation in
the T-DNA mutant 3707 and to test the biological effect
of overexpression of SEQ ID NO:1 (Arabidopsis microsomal
delta-15 fatty acid desaturase) in seed, the embryo-
specific promoter:SEQ ID NO:1 chimeric gene was
transformed into the mutant plant. This embryo-specific
expression cassette in pAW42 was produced, in part,
using a modified version of vector pCW109. Vector
pCW109 itself was made by inserting into the Hind III
site of the cloning vector pUC18 (Bethesda Research
Laboratory) a 555 bp 5' non-coding region (containing
the promoter region) of the β-conglycinin gene followed
by the multiple cloning sequence containing the
restriction endonuclease sites for Nco I, Sma I, Kpn I
and Xba I, then 1174 bp of the common bean phaseolin 3'
untranslated region into the Hind III site [Slightom et
al., Proc. Nat'l. Acad. Sci. U.S.A. (1983) 80:1897-1901].
The β-conglycinin promoter region used is an allele of
the published β-conglycinin gene (Doyle et al., J. Biol.
Chem. (1986) 261:9228-9238) due to differences at 27
nucleotide positions. Further sequence description may
be found in Slightom (WO91/13993).

The modifications to vector pCW109 were as follows:
The potential translation start site was destroyed by
digestion with Nco I and Xba I restriction enzymes
followed by treatment with mung bean nuclease (New
England Biolabs) to create linear, blunt ended DNA
molecules. After ligation of Not I linkers (New England
Biolabs) and digestion with Not I restriction enzyme
(New England Biolabs) the plasmid was recircularized.
Confirmation of the desired change was obtained by
dideoxy sequencing. The resulting plasmid was
designated pAW35. The 1.8 kB Hind III fragment from
pAW35 containing the modified β-conglycinin promoter/3'
phaseolin region was subcloned into the Hind III site in
plasmid vector pBluescript SK+ (Stratagene) creating
plasmid pAW36. Plasmid pAW36 was linearized at its
unique Eco RI site and ligated to Eco RI/Xho I adaptors
(Stratagene). Following digestion with Xho I, the 1.7
kB Xho I fragment containing the β-conglycinin
promoter/Not I site/3'-phaseolin untranslated region was
cloned into the Xho I site in pURA3 vector (Clontech).
The resultant plasmid, pAW42, contains the seed specific
expression cassette bordered by Xho I sites to
facilitate cloning into modified T-DNA binary vectors
and a unique Not I site to facilitate cloning of target
cDNA sequences. Vector pURA3 was chosen due to the
absence of a Not I restriction site, the presence of a
single Xho I restriction site, and because the
relatively large size of the vector (pURA3) would make
the isolation of the gene expression cassettes
relatively easy from the final construct.

The 1.4 kB Not I fragment in plasmid pCF3
containing Arabidopsis microsomal delta-15 desaturase
(SEQ ID NO:1) was isolated and ligated to plasmid pAW42
(the seed-specific expression cassette) that had
previously been linearized with Not I restriction enzyme
and treated with calf intestinal alkaline phosphatase
(Boehringer Mannheim) to result in plasmid pAW45 that
had SEQ ID NO:1 cloned in a sense orientation with
respect to the promoter. The orientation of the cDNA
relative to the promoter was established by digestion
with appropriate restriction endonucleases or by
sequencing across the promoter-cDNA junctions.
The chimeric β-conglycinin promoter/sense SEQ ID
NO:1/phaseolin 3' was isolated as a 3.2 kB Xho I
fragment from plasmid pAW45 and subcloned into the
binary vector pAW25 at its unique Sal I site. In the
resulting vector, pAW50, the orientation of the plant
selectable marker is the same as that of the
β-conglycinin promoter as ascertained by digestion with
appropriate restriction endonucleases. Plasmid pAW25
is derived from plasmids pZS94K and pML2. Plasmid
pZS94K contains the pBR322 origin of replication, the
replication and stability regions of the Pseudomonas
aeruginosa plasmid pVS1 [Itoh et al., (1984) Plasmid
11:206-220] required for replication and maintenance of
the plasmid in Agrobacterium, the bacterial NPT II gene
(kanamycin resistance) from Tn5 [Berg et al., (1975)
Proc. Nat'l. Acad. Sci. U.S.A. 72:3628-3632] as a
selectable marker for transformed bacteria, a T-DNA left
border fragment of the octopine Ti plasmid pTiA6 and
right border fragment derived from TiAch5 described by
van den Elzen et al. (Plant Mol. Biol. (1985)
5:149-154). Between these borders is the E. coli lacZ
α-complementing segment [Vieira and Messing (1982) Gene
19:259-267] with restriction endonuclease sites Sal I
and Asp 718 derived from pUC18. A 4.5 kB Asp 718-Sal I
DNA fragment containing the chimeric herbicide
sulfonylurea (SU)-resistant acetolactate synthase (ALS)
gene was obtained from plasmid pML2 and cloned into the
Asp 718-Sal I sites of plasmid pZS94K. This chimeric ALS
gene contained the CaMV 35S promoter/Cab22L Bgl II-Nco I
fragment that is described by Harpster et al. [Mol.
Gen. Genet. (1988) 212:182-190] and the Arabidopsis ALS
coding and 3' non-coding sequences [Mazur et al., (1987)
Plant Physiol. 85:1110-1117] that were mutated so that
they encode a SU-resistant form of ALS. The mutations,
introduced by site-directed mutagenesis, are those
present in the tobacco SU-resistant Hra gene described
by Lee et al. (1988) EMBO J. 5:1241-1248. The
resulting plasmid was designated pAW25.

The binary vector pAW25 containing the chimeric
embryo-specific β-conglycinin promoter:sense SEQ ID NO:1
gene was transformed by the freeze/thaw method [Holsters
et al., (1978) Mol. Gen. Genet. 163:181-187] into the
avirulent Agrobacterium strain LBA4404/pAL4404 [Hoekema
et al., (1983) Nature 303:179-180].

Arabidopsis root cultures were transformed by co-
cultivation with Agrobacterium. Standard aseptic
techniques for the manipulation of sterile media and
axenic plant/bacterial cultures were followed, including
the use of a laminar flow hood for all transfers.
Compositions of the culture media are listed in Table 8.
Unless otherwise indicated, 25x100 mm petri plates were
used for plant tissue cultures. Incubation of plant
tissue cultures was at 23°C under constant illumination
with mixed fluorescent and "Gro and Sho" plant lights
(General Electric) unless otherwise noted. To initiate
in vitro root cultures of the T-DNA homozygous mutant
line 3707 (Arabidopsis thaliana (L.) Heynh, geographic
race Wassilewskija), seeds of the mutant line were
sterilized for 10 min in a solution of 50% Clorox with
0.1% SDS, rinsed 3 to 5 times with sterile dH2O, dried
thoroughly on sterile filter paper, and then 2-3 seeds
were sown in liquid B5 medium in 250 mL Belco flasks.
The flasks were capped, placed on a rotary shaker at
70-80 rpm, and incubated for 3-4 weeks. Prior to
inoculation with Agrobacterium, root tissues were
cultured on callus induction medium (MSKig). Roots were
harvested by removing the root mass from the Belco
flask, placing it in a petri dish, and, using forceps,
pulling small bundles of roots from the root mass and
placing them on MSKig medium. Petri dishes were sealed
with filter tape and incubated for four days.

Agrobacterium strain LBA4404 carrying the plasmids
pAL4404 and pAW50 was grown in 5 mL of YEB broth
containing 25 mg/L kanamycin and 100 mg/L rifampicin.
The culture was grown for approximately 17-20 h in glass
culture tubes in a New Brunswick platform shaker (225
rpm) maintained at 28°C. Pre-cultured roots were cut
into 0.5 cm segments and placed in a 100 µm filter, made
from a Tri-Pour beaker (VWR Scientific, San Francisco,
CA USA) and wire mesh, which is set in a petri dish.
Root segments were inoculated for several min in 30-50
mL of a 1:20 dilution of the overnight Agrobacterium
culture with periodic gentle mixing. Inoculated roots
were transferred to sterile filter paper to draw off
most of the liquid. Small bundles of roots, consisting
of several root segments, were placed on MSKig medium
containing 100 µM acetosyringone (3',5'-Dimethoxy-4'-
hydroxyacetophenone, Aldrich Chemical Co., Milwaukee,
WI, USA). Petri plates were sealed with Parafilm or
filter tape and incubated for 2 to 3 days.

After infection, root segments were rinsed and
transferred to shoot induction medium with antibiotics.
Root bundles were placed in a 100 µm filter unit
(described above) and rinsed with 30-50 mL liquid MSKig
medium. The filter was vigorously shaken in the
solution to help remove the Agrobacterium, transferred
to a clean petri dish, and rinsed again. Roots were
blotted on sterile filter paper and bundles of roots
were placed on MSg medium containing 500 mg/L vancomycin
and either 10 or 20 ppb chlorsulfuron. Plates were
sealed with filter tape and incubated for 12 to 14 days.

Green nodules and small shoot primordia were
visible at about 2-3 weeks. The explants were either
left intact or were broken into numerous pieces and
placed on GM medium containing 200-300 mg/L vancomycin
and either 10 or 20 ppb chlorsulfuron for further shoot
development. Plates were either sealed with two pieces
of tape or with filter tape. As they developed,
individual shoots were isolated from the callus and were
placed on MSRg medium containing 100 mg/L vancomycin and
either 10 or 20 ppb chlorsulfuron. Dishes were sealed
as described above and incubated for seven to 10 days.
Shoots were then transferred to GM medium containing
100-200 mg/L vancomycin in 25x100 petri dishes or
Magenta G7 vessels. Many primary transformants (T1)
which were transferred to individual containers set seed
(T2).

T2 seed was harvested from selected putative
transformants and sown on GM medium containing 10 ppb
chlorsulfuron. Plates were sealed with filter tape,
cold treated for 2 or more days at 4°C, and then
incubated for 10 to 20 days at 23°C under constant
illumination as described above. Seedlings were scored
as resistant (green, true leaves develop) and sensitive
(no true leaves develop).

Selected chlorsulfuron resistant T2 seedlings were
transplanted to soil and were grown to maturity at 23°C
daytime (16 h), 18°C nighttime (8 h) at 65-80% relative
humidity.

T2 seeds from two plants were harvested at maturity
and analysed individually for fatty acid composition by
gas chromatography of the fatty acyl methyl esters
essentially as described by Browse et al. (Anal.
Biochem. (1986) 152:141-145) except that 2.5% H2SO4 in
methanol was used as the methylation reagent and samples
were heated for 1.5 h at 80°C to effect the methanolysis
of the seed triglycerides. The results are shown in
Table 7.



                      TABLE 7
          Percent Fatty Acid in Seeds of
              Transgenic Mutant 3707

Seed Sample        16:0   18:0   18:1   18:2   18:3

wildtype(6)          6      4     14     30     19
mutant 3707(6)       6      4     14     44      3
1-1                 10      4     22      9     55
1-2                 11      6     22     14     48
1-3                 12      7     16      6     57
1-4                 10      4     30     52      4
1-5                 10      4     18     17     48
1-6                 10      5     15     15     53
2-1                 11      5     19     60      4
2-2                 10      5     19      9     56
2-3                  9      4     27      8     52
2-4                 10      5     17     10     56
2-5                 10      5     19      9     56
2-6                 10      5     17     17     48

The fatty acid composition of the wild-type and
mutant line 3707 represents the average of 6 single
seeds each. Seeds from plant 1 are designated 1-1 to
1-6 and those from plant 2 are designated 2-1 to 2-6.
The 20:1 and 20:2 amounts are not shown. The data show
that one out of six seeds in each plant shows the
mutant fatty acid phenotype, while the remaining seeds
show a more than 10-fold increase in 18:3, to ca. 55%.
While most of the increase occurs at the expense of
18:2, some of it also occurs at the expense of 18:1.
Such high levels of linolenic acid in vegetable oils
are observed in specialty oil crops, such as linseed.
Thus, overexpression of this gene in other oil crops,
especially canola, which is a close relative of
Arabidopsis, is also expected to result in such high
levels of 18:3.
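
The single-seed 18:3 values of Table 7 can be summarized with a short calculation. The sketch below is illustrative only: the percentages are those listed in Table 7, while the 10% cut-off used to call a seed "mutant-like" is an assumption made here for the example and is not a criterion stated in the text.

```python
# 18:3 content (% of total fatty acids) of single seeds, from Table 7.
SEED_18_3 = {
    "1-1": 55, "1-2": 48, "1-3": 57, "1-4": 4, "1-5": 48, "1-6": 53,
    "2-1": 4, "2-2": 56, "2-3": 52, "2-4": 56, "2-5": 56, "2-6": 48,
}
MUTANT_3707_18_3 = 3   # average of 6 seeds of mutant line 3707 (Table 7)
CUTOFF = 10            # assumed threshold (%), chosen for illustration only


def summarize(seeds, baseline, cutoff):
    """Split seeds at the cutoff and report the mean fold increase in 18:3."""
    low = sorted(name for name, pct in seeds.items() if pct < cutoff)
    high = [pct for pct in seeds.values() if pct >= cutoff]
    mean_high = sum(high) / len(high)
    return low, mean_high, mean_high / baseline


low_seeds, mean_high, fold = summarize(SEED_18_3, MUTANT_3707_18_3, CUTOFF)
print("mutant-like seeds:", low_seeds)
print(f"mean 18:3 of remaining seeds: {mean_high:.1f}%")
print(f"fold increase over mutant 3707: {fold:.1f}")
```

Under this assumed cut-off, one seed per plant falls in the mutant-like class and the remaining seeds average well above ten times the 18:3 level of the mutant parent.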

Medium Composition

BASIC MEDIUM
1 Pkg. Murashige and Skoog Minimal Organics Medium without
       Sucrose (Gibco #510-3118 or Sigma #MS89S)
10 mL Vitamin Supplement
0.05% MES               0.5 g/L
0.8% agar               8 g/L
pH

VITAMIN SUPPLEMENT
10 mg/L thiamine
50 mg/L pyridoxine
50 mg/L nicotinic acid

YEP MEDIUM
Bacto Beef Extract      5.0 g
Bacto Yeast Extract     1.0 g
Peptone                     g
Sucrose                 5.0 g
MgSO4·7H2O              0.5 g
Agar (optional)        15.0 g
pH

GM = Germination Medium
Basic Medium
1% sucrose              10 g/L

MSKIg = Callus Induction Medium
Basic Medium
2% glucose              20 g/L
0.5 mg/L 2,4-D          2.3 µM
0.3 mg/L Kinetin        1.4 µM
5 mg/L IAA                  µM

MSRg = Shoot Induction Medium
Basic Medium
2% glucose              20 g/L
12 mg/L IBA             58.8 µM
0.1 mg/L Kinetin        0.46 µM

MSg = Shoot Induction Medium
Basic Medium
2% glucose              20 g/L
0.15 mg/L IAA           0.86 µM
5.0 mg/L 2iP            24.6 µM
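
The paired mg/L and µM values in the hormone entries above are related through the molecular weight of each compound. A minimal sketch of that conversion follows; the molecular weights are standard reference values supplied here as assumptions (they are not given in the text), and small differences from the tabulated µM figures reflect rounding.

```python
# Molecular weights in g/mol (reference values, not taken from the text).
MOLECULAR_WEIGHT = {
    "2,4-D": 221.04,
    "kinetin": 215.21,
    "IAA": 175.18,
    "IBA": 203.24,
    "2iP": 203.24,
}


def mg_per_l_to_micromolar(mg_per_l, compound):
    """Convert a mass concentration (mg/L) to a molar concentration (uM)."""
    # mg/L divided by g/mol gives mmol/L; multiply by 1000 for umol/L.
    return mg_per_l / MOLECULAR_WEIGHT[compound] * 1000.0


entries = [("2,4-D", 0.5), ("kinetin", 0.3), ("IBA", 12.0),
           ("kinetin", 0.1), ("IAA", 0.15), ("2iP", 5.0)]
for compound, mg_per_l in entries:
    um = mg_per_l_to_micromolar(mg_per_l, compound)
    print(f"{mg_per_l} mg/L {compound} = {um:.2f} uM")
```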
EXAMPLE 10

Construction of Vectors for Transformation
of Brassica napus for Reduced Expression
of Delta-15 Desaturases in Developing Seeds

Detailed procedures for manipulation of DNA
fragments by restriction endonuclease digestion, size
separation by agarose gel electrophoresis, isolation of
DNA fragments from agarose gels, ligation of DNA
fragments, modification of cut ends of DNA fragments and
transformation of E. coli cells with circular DNA
plasmids are all described in Sambrook et al.
(Molecular Cloning, A Laboratory Manual, 2nd ed. (1989)
Cold Spring Harbor Laboratory Press) and Ausubel et al.
(Current Protocols in Molecular Biology (1989) John Wiley
& Sons).

Sequences of the cDNAs encoding the B. napus
cytoplasmic delta-15 desaturase and the Brassica napus
plastid delta-15 desaturase were placed in the antisense
orientation behind the promoter region from the α'
subunit of the soybean storage protein β-conglycinin to
provide embryo-specific expression and high expression
levels.

An embryo-specific expression cassette was
constructed to serve as the basis for chimeric gene
constructs for antisense expression of the nucleotide
sequences of delta-15 desaturase cDNAs. The vector
pCW109 was produced by the insertion of 555 base pairs
of the β-conglycinin (α' subunit of the 7S seed storage
protein) promoter from soybean (Glycine max), the
β-conglycinin 5' untranslated region followed by a
multiple cloning sequence containing the restriction
endonuclease sites for Nco I, Sma I, Kpn I and Xba I,
then 1174 base pairs of the common bean phaseolin 3'
untranslated region into the Hind III site in the
cloning vector pUC18 (BRL). The β-conglycinin promoter
sequence represents an allele of the published
β-conglycinin gene (Doyle et al. (1985) J. Biol. Chem.
261:9228-9238) due to differences at 27 nucleotide
positions. Further sequence description may be found in
Slightom (WO 91/13993). The sequence of the 3'
untranslated region of phaseolin is described in
Slightom et al. ((1983) Proc. Natl. Acad. Sci. USA
80:1897-1901).

To facilitate use in antisense constructions, the
Nco I site and potential translation start site in the
plasmid pCW109 were destroyed by digestion with Nco I,
mung bean exonuclease digestion and re-ligation of the
blunt site to give the modified plasmid pCW109A.
pCW109A was opened between the β-conglycinin promoter
sequence and the phaseolin 3' sequence by digestion with
Sma I to allow insertion of blunt-ended cDNA fragments
encoding the delta-15 desaturase sequences by ligation.
The blunt-ended fragment of the cytoplasmic delta-15
desaturase was obtained from plasmid pBNSF3, which
contains the nucleotides 208 to 1336 of the cDNA insert
described in SEQ ID NO:6. pBNSF3 was modified to remove
the Hind III site at bases 682 to 687 of SEQ ID NO:6 by
digesting with Hind III, blunting with Klenow and
re-ligating. The resulting plasmid [pBNSF3(-H)] was
digested with Eco RI and Xho I to release the delta-15
cDNA fragment, all ends were Klenow blunted and the 1.2
kB coding region was purified by gel isolation. The 1.2
kB fragment was ligated into the Sma I cut pCW109A
described above. The antisense orientation of the
inserted cDNA relative to the β-conglycinin promoter was
established by digestion with Aat I, which cuts in the
delta-15 desaturase coding region and in the vector 5'
to the β-conglycinin promoter to release a 1.4 kB
fragment when the coding region is in the antisense
orientation. The antisense construction was given the
name pCCFdR1.
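
Placing a cDNA in the antisense orientation behind a promoter means that the transcript is the reverse complement of the normal mRNA. The following sketch illustrates only that relationship. For convenience it uses the first six codons of the coding region listed in SEQ ID NO:1 as an example sequence; the constructs of this Example use the B. napus cDNAs, and the function name is a hypothetical helper.

```python
# Map each base to its Watson-Crick partner (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Example only: first six codons of the CDS in SEQ ID NO:1.
sense = "ATGGTTGTTGCTATGGAC"
antisense = reverse_complement(sense)

print("sense     5'-" + sense + "-3'")
print("antisense 5'-" + antisense + "-3'")
```

Applying the function twice returns the original sequence, which is a quick check that an insert has been described in the intended orientation.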

The transcription unit [β-conglycinin
promoter:antisense delta-15 desaturase:phaseolin 3' end]
was released from pCCFdR1 by Hind III digestion,
isolated, and ligated into pBluescript which had also
been Hind III digested to give plasmid pCCFdR2. This
construct has unique Bam HI and Sal I sites, which were
digested. The 3 kB transcriptional unit was isolated
and cloned into the Bam HI and Sal I sites in pZS199
described below to give the binary vector pZCC3FdR. The
orientation given by this directional cloning is with
transcription of both the selectable marker gene and the
delta-15 antisense gene in the same direction and toward
the right border T-DNA sequence.

An antisense construction based on the plastid
delta-15 desaturase was made with the 425 most 3' bases
of SEQ ID NO:8, which is contained in the plasmid
pBNSFD-8. pBNSFD-8 represents a cDNA of the plastid delta-15
desaturase in pBluescript. The cDNA insert was removed
from pBNSFD-8 by digestion with Xho I and Sma I, the
fragments were blunted, and the 425 base insert isolated
by gel purification. The isolated fragment was cloned
into the Sma I site of pCW109A and the antisense
orientation of the chosen clone confirmed by digestion
of the plasmid with Pst I. Pst I cuts in the plastid
delta-15 sequence and in the pCW109A vector 5' to the
β-conglycinin promoter to release a 1.2 kB fragment
indicative of the antisense orientation. The plasmid
containing this construction was called pCCdFdR1.

Digestion of pCCdFdR1 with Hind III removes a 2.3
kB fragment containing the transcriptional unit
[β-conglycinin promoter:plastid delta-15 antisense:3'-phaseolin
sequence]. The fragment was gel isolated and
cloned into Hind III digested pBluescript. The
orientation of the fragment relative to the Bam HI
site in the cloning region of pBluescript was determined
by digestion with Pst I as described above. A clone
oriented with the promoter toward the Sal I containing
end was chosen and given the name pCCdFdR2.

pCCdFdR2 was digested with Bam HI and Sal I, the
released fragment was gel isolated and ligated into
pZS199 which had been digested with Bam HI and Sal I to
give the binary vector pZCCdFdR.

Vectors for transformation of the antisense
delta-15 desaturase constructions under control of the
β-conglycinin promoter into plants using Agrobacterium
tumefaciens were produced by constructing a binary Ti
plasmid vector system (Bevan (1984) Nucl. Acids Res.
12:8711-8720). The starting vector used for these
systems (pZS199) is based on a vector which contains:
(1) the chimeric gene nopaline synthase/neomycin
phosphotransferase as a selectable marker for
transformed plant cells (Bevan et al. (1984) Nature
304:184-186), (2) the left and right borders of the
T-DNA of the Ti plasmid (Bevan et al. (1984) Nucl.
Acids Res. 12:8711-8720), (3) the E. coli lacZ
α-complementing segment (Vieira and Messing (1982) Gene
19:259-267) with unique restriction endonuclease sites
for Eco RI, Kpn I, Bam HI, Hind III, and Sal I, (4) the
bacterial replication origin from the Pseudomonas
plasmid pVS1 (Itoh et al. (1984) Plasmid 11:206-220),
and (5) the bacterial neomycin phosphotransferase gene
from Tn5 (Berg et al. (1975) Proc. Natl. Acad. Sci.
U.S.A. 72:3628-3632) as a selectable marker for
transformed A. tumefaciens. The nopaline synthase
promoter in the plant selectable marker was replaced by
the 35S promoter (Odell et al. (1985) Nature
313:810-813) by a standard restriction endonuclease
digestion and ligation strategy. The 35S promoter is
required for efficient Brassica napus transformation as
described below.

EXAMPLE 11
AGROBACTERIUM MEDIATED TRANSFORMATION
OF BRASSICA NAPUS

The binary vectors pZCC3FdR and pZCCdFdR were
transferred by a freeze/thaw method (Holsters et al.
(1978) Mol. Gen. Genet. 163:181-187) to the Agrobacterium
strain LBA4404/pAL4404 (Hoekema et al. (1983) Nature
303:179-180).

Brassica napus cultivar "Westar" was transformed by
co-cultivation of seedling pieces with disarmed
Agrobacterium tumefaciens strain LBA4404 carrying the
appropriate binary vector.

B. napus seeds were sterilized by stirring in 10%
Chlorox, 0.1% SDS for thirty min, and then rinsed
thoroughly with sterile distilled water. The seeds were
germinated on sterile medium containing 30 mM CaCl2 and
1.5% agar, and grown for six days in the dark at 24°C.

Liquid cultures of Agrobacterium for plant
transformation were grown overnight at 28°C in Minimal A
medium containing 100 mg/L kanamycin. The bacterial
cells were pelleted by centrifugation and resuspended at
a concentration of 10^8 cells/mL in liquid Murashige and
Skoog Minimal Organic medium containing 100 µM
acetosyringone.

B. napus seedling hypocotyls were cut into 5 mm
segments which were immediately placed into the
bacterial suspension. After 30 min, the hypocotyl
pieces were removed from the bacterial suspension and
placed onto BC-12 callus medium containing 100 µM
acetosyringone. The plant tissue and Agrobacteria were
co-cultivated for three days at 24°C in dim light.

The co-cultivation was terminated by transferring
the hypocotyl pieces to BC-12 callus medium containing
200 mg/L carbenicillin to kill the Agrobacteria, and
25 mg/L kanamycin to select for transformed plant cell
growth. The seedling pieces were incubated on this
medium for three weeks at 24°C under continuous light.

After three weeks, the segments were transferred to
BS-48 regeneration medium containing 200 mg/L
carbenicillin and 25 mg/L kanamycin. Plant tissue was
subcultured every two weeks onto fresh selective
regeneration medium, under the same culture conditions
described for the callus medium. Putatively transformed
calli grow rapidly on regeneration medium; as calli
reached a diameter of about 2 mm, they were removed from
the hypocotyl pieces and placed on the same medium
lacking kanamycin.

Shoots began to appear within several weeks after
transfer to BS-48 regeneration medium. As soon as the
shoots formed discernible stems, they were excised from
the calli, transferred to MSV-1A elongation medium, and
moved to a 16:8 h day/night photoperiod at 24°C.

Once shoots had elongated several internodes, they
were cut above the agar surface and the cut ends were
dipped in Rootone. Treated shoots were planted directly
into wet Metro-Mix 350 soilless potting medium. The pots
were covered with plastic bags which were removed when
the plants were clearly growing, after about 10 days.

Plants were grown under a 16:8 h day/night photoperiod,
with a daytime temperature of 23°C and a
nighttime temperature of 17°C. When the primary
flowering stem began to elongate, it was covered with a
mesh pollen-containment bag to prevent outcrossing.
Self-pollination was facilitated by shaking the plants
several times each day. Seeds derived from self-pollinations
were harvested about three months after
planting.
TABLE 9

Minimal A Bacterial Growth Medium
Dissolve in distilled water:
10.5 g potassium phosphate, dibasic
4.5 g potassium phosphate, monobasic
1.0 g ammonium sulfate
0.5 g sodium citrate, dihydrate
Make up to 979 mL with distilled water
Autoclave
Add 20 mL filter-sterilized 10% sucrose
Add 1 mL filter-sterilized 1 M MgSO4

Brassica Callus Medium BC-12
Per liter:
Murashige and Skoog Minimal Organic Medium (MS salts, 100
mg/L i-inositol, 0.4 mg/L thiamine; GIBCO #510-3118)
30 g sucrose
18 g mannitol
1.0 mg/L
3.0 mg/L kinetin
0.6% agarose
pH 5.8

Brassica Regeneration Medium BS-48
Murashige and Skoog Minimal Organic Medium
Gamborg B5 Vitamins (SIGMA #1019)
10 g glucose
250 mg xylose
600 mg MES
0.4% agarose
pH 5.7
Filter-sterilize and add after autoclaving:
2.0 mg/L zeatin
0.1 mg/L IAA

Brassica Shoot Elongation Medium MSV-1A
Murashige and Skoog Minimal Organic Medium
Gamborg B5 Vitamins
10 g sucrose
0.6% agarose
pH 5.8

EXAMPLE 12

ANALYSIS OF TRANSGENIC BRASSICA NAPUS PLANTS

Insertion of the intact antisense transcriptional
unit was verified by Southern analysis using transgenic
plant leaf tissue as the source of DNA as described in
Example 5. Ten micrograms of leaf DNA was digested to
completion with a mixture of Bam HI and Sal I
restriction endonucleases and then separated by agarose
gel electrophoresis. The separated DNA was transferred
to Hybond H+ membrane and hybridized with radiolabeled
insert from pBNSF3-2. An estimate of the number of
copies of the inserted transgene was made by calibrating
each Southern blot with standard amounts of pBNSF3-2
corresponding to 1 and 5 copies per genome and comparing
intensities of the autoradiographic signal from the
standards, the endogenous delta-15 desaturase signals
and the inserted gene signal. To date, 38 independent
transformants have been analyzed for presence of the
gene and 36 were found to be positive.
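
The amount of standard DNA that corresponds to a given number of copies per genome follows from the ratio of target length to genome size. The sketch below shows that arithmetic only; the genome size and the length of the hybridizing insert are assumed placeholder values chosen for illustration (neither is stated in the text), and "per genome" is taken here to mean per haploid genome.

```python
def standard_mass_pg(genomic_dna_ug, genome_size_bp, target_bp, copies):
    """Mass (pg) of target sequence equal to `copies` per genome
    in the given amount of genomic DNA."""
    fraction = copies * target_bp / genome_size_bp
    return genomic_dna_ug * 1e6 * fraction  # 1 ug = 1e6 pg


GENOMIC_DNA_UG = 10   # amount of leaf DNA digested per lane (from the text)
GENOME_BP = 1.2e9     # assumed approximate haploid genome size
TARGET_BP = 1.2e3     # assumed length of the hybridizing insert

for copies in (1, 5):
    mass = standard_mass_pg(GENOMIC_DNA_UG, GENOME_BP, TARGET_BP, copies)
    print(f"{copies} copy standard: {mass:.1f} pg of insert sequence")
```

If whole plasmid rather than isolated insert is loaded as the standard, the mass scales with the ratio of plasmid length to insert length.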

The relative content of the 5 most abundant fatty
acids in canola seeds was determined either by direct
trans-esterification of individual seeds in 0.5 mL of
methanolic H2SO4 (2.5%) or by hexane extraction of bulk
seed samples followed by trans-esterification of an
aliquot in 0.8 mL of 1% sodium methoxide in methanol.
Fatty acid methyl esters were extracted from the
methanolic solutions into hexane after the addition of
an equal volume of water.

The relative content of 18:3 fatty acid varies
significantly during seed development. To a lesser
extent, the ratio of 18:3 to 18:2 varies also. Thus
meaningful data can be obtained only from seeds after
maturation and drydown. Additionally, the ratio of 18:3
to total fatty acid content and to 18:0 varies
significantly due to environmental factors, primarily
temperature. In this circumstance, the most appropriate
controls are the transformed plants which by Southern
analysis do not contain the antisense delta-15
transgene. Analyses from the first 5 transformants to
reach dry seed are given in Table 10 below. Seeds were
harvested using a hand thresher, bulked and a 1.5 g
(about 300 seeds) sample was taken. Seed from each
transformant was crushed with a mortar and pestle and
extracted 4 times with 8 mL hexane at about 50°C. The
combined extracts were reduced in volume to 5 mL and two
50 microliter aliquots were taken for esterification as
described above. Separation of the fatty acid methyl
esters was done by gas-liquid chromatography using an
Omegawax 320 column (Supelco Inc., 0.32 mm ID x 30 m) run
isothermally at 220° and cycled to 260° between each
injection.

TABLE 10

Line             % 18:3   18:3/18:2   Copy No.
pZCC3FdR-91        6.2      0.39         0
pZCC3FdR-81        5.9      0.33         1
pZCC3FdR-15        6.0      0.36         2
pZCC3FdR-11        5.6      0.34         1
pZCC3FdR-148       8.2      0.40         2



The differences between the 4 transformed lines and
line 91 are very small; however, to test the significance
of the difference in the 18:3/18:2 ratio between line 81
and 91, 25 individual seeds from each line were
trans-esterified and their fatty acid composition determined.
The average ratio for line 81 was 0.345 with a
coefficient of variation of 11.6% while the average for
line 91 was 0.375 with a coefficient of variation of
8.0%. The sample means are significantly different at
the 0.01% level using Student's t test.
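
The t statistic for this comparison can be recomputed from the summary values quoted above. The sketch below assumes an equal-variance (pooled) two-sample Student's t test, which is one common reading of "Student's t test"; the text does not state which variant was used. Standard deviations are recovered from the means and coefficients of variation.

```python
import math


def pooled_t(mean1, cv1, n1, mean2, cv2, n2):
    """Two-sample pooled-variance t statistic from means, CVs and sample sizes."""
    sd1 = mean1 * cv1          # standard deviation = mean x CV
    sd2 = mean2 * cv2
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean2 - mean1) / std_err, df


# Line 81: mean 0.345, CV 11.6%; line 91: mean 0.375, CV 8.0%; 25 seeds each.
t_stat, deg_freedom = pooled_t(0.345, 0.116, 25, 0.375, 0.080, 25)
print(f"t = {t_stat:.2f} with {deg_freedom} degrees of freedom")
```

The resulting statistic can be compared against tabulated critical values of the t distribution for the stated degrees of freedom.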
EXAMPLE 13

CONSTRUCTION OF VECTORS FOR TRANSFORMATION OF
GLYCINE MAX FOR REDUCED EXPRESSION OF DELTA-15
DESATURASES IN DEVELOPING SEEDS

The antisense G. max plastid delta-15 desaturase
cDNA sequence under control of the β-conglycinin
promoter was constructed using the vector pCW109A
described in Example 10 above. For use in the soybean
transformation system described below, the
transcriptional unit was placed in a vector along with an
appropriate selectable marker expression system. The
starting vector was pML45, which consists of the
non-tissue-specific and constitutive promoter designated
508D and described in Hershey (WO 9011361) driving
expression of the neomycin phosphotransferase gene
described in Beck et al. ((1982) Gene 19:327-336)
followed by the 3' end of the nopaline synthase gene
including nucleotides 848 to 1550 described by Depicker
et al. ((1982) J. Appl. Genet. 1:561-574). This
transcriptional unit was inserted into the commercial
cloning vector pGEM9Z (BRL) and is flanked at the 5' end
of the 508D promoter by the restriction sites Sal I, Xba
I, Bam HI and Sma I in that order. An additional Sal I
site is present at the 3' end of the NOS 3' sequence and
the Xba I, Bam HI and Sal I sites are unique.

Removal of the unit [β-conglycinin promoter:cloning
region:phaseolin 3' end] from pCW109A by digestion with
Hind III, blunting the ends and isolating the 1.8 kB
fragment afforded the expression cassette pCST by
ligating the above isolated fragment into the Sma I site
of pML45. A clone with the β-conglycinin promoter in
the same orientation as the 508D promoter was chosen by
digestion with Xba I. The correct orientation releases
a 700 bp fragment. This vector cassette was called
pCST.

The 2.2 kB insert encoding the soybean plastid
delta-15 desaturase was subcloned from the plasmid pXF1
by digestion with HinP I to remove about 1 kB of
unrelated cDNA. HinP I cuts within the cDNA insert very
near the 5' end of the cDNA for the delta-15 desaturase
and about 300 bp from the 3' end of that cDNA. The Cla
I compatible ends were cloned into Cla I digested
pBluescript and a clone with the 5' end of the cDNA
toward the Eco RV site in the pBluescript cloning region
was selected based on the release of a 900 bp fragment
by digestion with Pst I. The subcloned plasmid was
called pS3Fd1.

The delta-15 encoding sequence was removed from
pS3Fd1 by digestion with Hinc II and Eco RV, the 2.2 kB
fragment was gel isolated and cloned into the opened Sma
I site in pCST1. A clone with the delta-15 sequence in
the antisense orientation to the β-conglycinin promoter
was selected by digestion with Xba I. The antisense
construct releases a 400 bp piece and that clone was
designated pCS3FdST1R.

EXAMPLE 14

TRANSFORMATION OF SOMATIC SOYBEAN EMBRYO CULTURES

Soybean embryogenic suspension cultures are
maintained in 35 mL liquid media (SB55 or SBP6) on a
rotary shaker, 150 rpm, at 28°C with mixed fluorescent
and incandescent lights on a 16:8 h day/night schedule.
Cultures were subcultured every four weeks by
inoculating approximately 35 mg of tissue into 35 mL of
liquid medium.

Soybean embryogenic suspension cultures were
transformed with pCS3FdST1R by the method of particle
gun bombardment (see Klein et al. (1987) Nature (London)
327:70). A Du Pont Biolistic PDS1000/HE instrument
(helium retrofit) was used for these transformations.

To 50 µL of a 60 mg/mL 1 µm gold particle
suspension was added (in order): 5 µL DNA (1 µg/µL), 20
µL spermidine (0.1 M), and 50 µL CaCl2 (2.5 M). The
particle preparation was agitated for 3 min, spun in a
microfuge for 10 sec and the supernatant removed. The
DNA-coated particles were then washed once in 400 µL 70%
ethanol and resuspended in 40 µL of anhydrous ethanol.
The DNA/particle suspension was sonicated three times
for 1 sec each. Five µL of the DNA-coated gold
particles were then loaded on each macro carrier disk.
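
The quantities in this particle preparation imply a fixed amount of gold and an upper bound on DNA per macro carrier disk. The sketch below works through that arithmetic. It assumes that the gold suspension volume is 50 microliters and that all of the added DNA binds to the particles and is recovered, neither of which is guaranteed in practice.

```python
GOLD_STOCK_MG_PER_ML = 60.0   # gold suspension concentration
GOLD_VOLUME_UL = 50.0         # volume of gold suspension (assumed microliters)
DNA_VOLUME_UL = 5.0           # DNA added
DNA_UG_PER_UL = 1.0           # DNA concentration
RESUSPENSION_UL = 40.0        # final volume in anhydrous ethanol
LOAD_PER_DISK_UL = 5.0        # volume loaded on each macro carrier disk

gold_mg = GOLD_STOCK_MG_PER_ML * GOLD_VOLUME_UL / 1000.0   # uL -> mL
dna_ug = DNA_VOLUME_UL * DNA_UG_PER_UL
disks = RESUSPENSION_UL / LOAD_PER_DISK_UL

gold_per_disk_mg = gold_mg / disks
dna_per_disk_ug = dna_ug / disks   # upper bound: assumes complete binding

print(f"total gold: {gold_mg:.1f} mg, total DNA: {dna_ug:.1f} ug")
print(f"disks per preparation: {disks:.0f}")
print(f"per disk: {gold_per_disk_mg:.3f} mg gold, "
      f"up to {dna_per_disk_ug:.3f} ug DNA")
```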

Approximately 300-400 mg of a four week old
suspension culture was placed in an empty 60x15 mm petri
dish and the residual liquid removed from the tissue
with a pipette. For each transformation experiment,
approximately 5-10 plates of tissue were normally
bombarded. Membrane rupture pressure was set at 1000
psi and the chamber was evacuated to a vacuum of 28
inches of mercury. The tissue was placed approximately
3.5 inches away from the retaining screen and bombarded
three times. Following bombardment, the tissue was
placed back into liquid and cultured as described above.

Eleven days post bombardment, the liquid media was
exchanged with fresh SB55 containing 50 mg/mL
hygromycin. The selective media was refreshed weekly.
Seven weeks post bombardment, green, transformed tissue
was observed growing from untransformed, necrotic
embryogenic clusters. Isolated green tissue was removed
and inoculated into individual flasks to generate new,
clonally propagated, transformed embryogenic suspension
cultures. Thus each new line was treated as an independent
transformation event. These suspensions can then be
maintained as suspensions of embryos clustered in an
immature developmental stage through subculture or
regenerated into whole plants by maturation and
germination of individual somatic embryos.

Transformed embryogenic clusters were removed from
liquid culture and placed on a solid agar media (SB103)
containing no hormones or antibiotics. Embryos were
cultured for eight weeks at 26°C with mixed fluorescent
and incandescent lights on a 16:8 h day/night schedule.
During this period, individual embryos were removed from
the clusters and analyzed at various stages of embryo
development. After eight weeks the embryos become
suitable for germination.

TABLE 11

Media:

SB55 and SBP6 Stock Solutions (g/L):

MS Sulfate 100X Stock
MgSO4 7H2O       37.0
MnSO4 H2O         1.69
ZnSO4 7H2O        0.86
CuSO4 5H2O        0.0025

MS Halides 100X Stock
CaCl2 2H2O       44.0
KI                0.083
CoCl2 6H2O        0.00125
KH2PO4           17.0
H3BO3             0.62
Na2MoO4 2H2O      0.025

MS FeEDTA 100X Stock
Na2EDTA           3.724
FeSO4 7H2O        2.784

B5 Vitamin Stock
10 g m-inositol
100 mg nicotinic acid
100 mg pyridoxine HCl
1 g thiamine

SB55 (per Liter)
10 mL each MS stocks
1 mL B5 Vitamin stock
0.8 g NH4NO3
3.033 g KNO3
1 mL 2,4-D (10 mg/mL stock)
60 g sucrose
0.667 g asparagine
pH 5.7
For SBP6, substitute 0.5 mL 2,4-D

SB103 (per Liter)
MS Salts
6% maltose
750 mg MgCl2
0.2% Gelrite
pH 5.7



EXAMPLE 15

ANALYSIS OF TRANSGENIC GLYCINE MAX PLANTS

While in the globular embryo state in liquid
culture as described in Example 14, somatic soybean
embryos contain very low amounts of triacylglycerol or
storage proteins typical of maturing, zygotic soybean
embryos. At this developmental stage, the ratio of
total triacylglyceride to total polar lipid
(phospholipids and glycolipid) is about 1:4, as is
typical of zygotic soybean embryos at the developmental
stage from which the somatic embryo culture was
initiated. At the globular stage as well, the mRNAs for
the prominent seed proteins (α' subunit of
β-conglycinin, Kunitz Trypsin Inhibitor III and Soybean
Seed Lectin) are essentially absent. Upon transfer to
hormone free media to allow differentiation to the
maturing somatic embryo state as described in Example
14, triacylglycerol becomes the most abundant lipid
class. As well, mRNAs for the α' subunit of β-conglycinin,
Kunitz Trypsin Inhibitor III and Soybean Seed Lectin
become very abundant messages in the total mRNA
population. In these respects the somatic soybean
embryo system behaves very similarly to maturing zygotic
soybean embryos in vivo, and is therefore a good and
rapid model system for analyzing the phenotypic effects
of modifying the expression of genes in the fatty acid
biosynthesis pathway. Similar somatic embryo culture
systems have been documented and used in another oilseed
crop, rapeseed (Taylor et al. (1990) Planta 181:18-26).

Fatty acid analysis was performed as described in
Example 12 using single embryos as the tissue source. A
number of embryos from line 2872 (control tissue
transformed with pCST) and lines 299, 303, 306 and 307
(line 2872 transformed with plasmid pCS3FdST1R) were
analyzed for fatty acid content. The relative fatty
acid composition of embryos taken from tissue
transformed with pCS3FdST1R was compared with control
tissue, transformed with pCST. The results of this
analysis are shown in Table 12.







TABLE 12

Line        Embryo   16:0   18:0   18:1   18:2   18:3
2872          1      17.7    4.1   11.3   52.8   14.1
              2      17.3    4.3   10.9   49.5   18.0
              3      16.1    4.1   13.8   48.2   17.3
              4      17.5    3.6   11.7   52.0   14.1
              5      16.6    3.9   12.7   53.7   12.6
              6      14.8    3.0   14.7   55.3   11.1
              av     16.7    3.8   12.5   51.9   14.5
299-1-3       1      16.5    4.1    9.7   61.4    6.3
299-15-1      1      14.7    3.6   11.9   61.3    8.4
              2      16.6    3.7   12.1   58.6    8.6
              3      16.7    4.1   14.9   53.2   11.1
              4      15.2    4.0    9.1   60.2   11.5
              5      16.0    4.2   13.9   55.2   10.7
              6      15.2    3.5    9.9   63.4    8.1
303-7-1       1      14.1    2.2   10.6   59.4   13.7
              2      14.0    2.8   12.5   59.3   11.4
306-4-5       1      17.5    4.2    8.1   62.7    7.4
              2      15.7    3.3    9.0   60.5   11.5
              3      17.1    3.4    9.3   60.7    9.5
              4      15.7    3.8    9.2   61.2    9.7
              5      17.7    3.9    6.5   58.3   13.6
              6      16.6    3.4   10.2   59.2   10.6
306-4-8       1      16.6    3.9   15.3   50.7   11.8
              2      17.8    3.6   15.7   50.0   10.8
              3      16.7    3.3   11.1   52.0   14.6
              4      19.0    4.0   10.3   53.1   12.3
              5      19.7    3.5    9.0   53.6   13.0
              6      18.0    2.9   13.1   52.8   10.9
307-1-1       1      14.4    3.7   11.2   64.4    6.3
              2      15.4    3.4    7.8   61.0   11.3
              3      17.2    2.5   12.0   57.2   11.1
307-1-2       1      13.4    3.0    8.4   55.4   19.9
              2      16.3    3.1    6.4   55.7   18.7
              3      14.0    3.3    8.8   58.7   15.2
              4      15.8    2.5    9.8   59.7   12.2
              5      14.6    3.7   14.9   51.1   15.7
              6      14.3    3.9   11.4   55.5   14.1
307-1-3       1      14.8    3.1   [illegible]    12.2
              2      18.0    3.0    5.3   56.2   15.2
              3      18.0    3.4    2.5   58.6   15.4
307-1-4       1      15.0    2.7   13.8   61.7    6.9
              2      15.9    2.7    9.8   62.0    9.6
              3      14.6    3.2   13.4   61.4    6.7
307-1-5       1      15.9    3.5    7.6   61.7   11.2
              2      14.6    3.5   10.0   61.3   10.6
              3      18.7    2.6    6.8   53.0   19.0
307-1-7       1      15.3    3.5   12.5   60.3    8.5
              2      16.2    2.2   13.9   57.1   10.6
              3      14.9    3.1   12.2   58.0   11.8
307-1-9       1      16.4    2.9   23.2   47.9    9.6
              2      19.6    0.0   20.4   51.3    8.8
              3      16.8    3.3   24.6   49.6    5.7
307-1-11      1      18.1    3.6    5.7   52.9   19.7
              2      14.7    3.7    9.9   58.7   13.0
              3      15.1    3.7   11.3   55.8   14.1

The average 18:3 content of control embryos was
14.5% with a range from 11.1% to 18.0%. The average
18:3 content of transformed embryos was 11.5% with a
range of 6.3% to 19.9%. Almost 80% of the transformed
embryos (38/48) had an 18:3 content below that of the
control mean. About 44% had an 18:3 content less than
the lowest observed control value and 12.5% had an 18:3
content less than half of the control mean value (i.e.,
less than 7.5%). The lowest 18:3 content observed in
transformed tissue was 6.3% (299-1-3, 307-1-1 #1)
compared with the control low of 11.1%. In all cases in
transformed tissue, a decrease in 18:3 content was
reflected by an equivalent increase in 18:2 content,
indicating that the desaturation of 18:2 to 18:3 had
been reduced. The relative content of the other
fatty acids remained unchanged.
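
The control statistics quoted above can be reproduced from the six control embryos of line 2872 listed in Table 12. The sketch below is a simple check of the mean, the range and the 38/48 proportion; the variable names are illustrative.

```python
# 18:3 content (%) of the six control embryos of line 2872, from Table 12.
CONTROL_18_3 = [14.1, 18.0, 17.3, 14.1, 12.6, 11.1]

control_mean = sum(CONTROL_18_3) / len(CONTROL_18_3)
control_low = min(CONTROL_18_3)
control_high = max(CONTROL_18_3)

# Proportion of transformed embryos reported below the control mean.
below_mean_pct = 38 / 48 * 100

print(f"control mean 18:3: {control_mean:.1f}%")
print(f"control range: {control_low}% to {control_high}%")
print(f"transformed embryos below control mean: {below_mean_pct:.1f}%")
```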

Southern analysis for the presence of the intact,
introduced antisense construction was performed, as
described in Example 12 using Bam HI cut gDNA, on a
number of the transformed lines listed below using
groups of embryos from a single transformation event.
The approximate intact antisense copy number was
estimated from the number and intensity of hybridizing
bands on the autoradiograms and is shown in Table 13.






TABLE 13

            Antisense   18:3    18:3        18:2/18:3
Line No.    copy No.    (low)   (average)   ratio
2872           0        11.1     14.5         3.6
303-7/1        1        11.4     12.6         4.7
307-1/2        3        12.2     16.0         3.5
306-4/8        3        10.8     12.2         4.3
307-1/7        4         8.5     10.3         5.7
306-4/5        6         7.4     10.4         5.8
307-1/1        6         6.3      9.6         6.3
299-15/1       7         8.1      9.7         6.1
307-1/4        8         6.7      7.7         8.0


There was a reasonable correlation between intact
antisense copy number and 18:3 content, an increase in
copy number correlating with a decreased 18:3 content
and a consequent increase in the 18:2/18:3 ratio. The
average 18:2/18:3 ratio of line 307-1/4, which had at
least 8 copies of the antisense cDNA, was more than
twice that of the control.
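
The relationship described here can be quantified from the values in Table 13. The sketch below computes a Pearson correlation coefficient between copy number and average 18:3 content; the choice of Pearson's r is an assumption made for illustration, since the text does not say how the correlation was assessed.

```python
import math

# (antisense copy number, average 18:3 %, 18:2/18:3 ratio) from Table 13.
TABLE_13 = {
    "2872": (0, 14.5, 3.6), "303-7/1": (1, 12.6, 4.7),
    "307-1/2": (3, 16.0, 3.5), "306-4/8": (3, 12.2, 4.3),
    "307-1/7": (4, 10.3, 5.7), "306-4/5": (6, 10.4, 5.8),
    "307-1/1": (6, 9.6, 6.3), "299-15/1": (7, 9.7, 6.1),
    "307-1/4": (8, 7.7, 8.0),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


copies = [row[0] for row in TABLE_13.values()]
avg_18_3 = [row[1] for row in TABLE_13.values()]
r = pearson_r(copies, avg_18_3)
ratio_vs_control = TABLE_13["307-1/4"][2] / TABLE_13["2872"][2]

print(f"Pearson r (copy number vs average 18:3): {r:.2f}")
print(f"18:2/18:3 ratio of 307-1/4 relative to control: {ratio_vs_control:.2f}")
```

A negative coefficient corresponds to the trend stated in the text, in which higher copy number goes with lower 18:3 content.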
(i) APPLICANTS: Browse, John; Kinney, Anthony J.;
Pierce, John; Wierzbicki, Anna K.;
Yadav, Narendra S.; Perez-Grau, Luis

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1350 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Arabidopsis thaliana

(vii) IMMEDIATE SOURCE:

(B) CLONE: pCF3

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 46..1206

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
CTCTCTCTCT CTCTCTTCTC TCTTTCTCTC CCCCTCTCTC CGGCG ATG GTT GTT 54
Met Val Val
1

GCT ATG GAC CAA CGC ACC AAT GTG AAC GGA GAT CCG GGC GCC GGA GAC 102
Ala Met Asp Gln Arg Thr Asn Val Asn Gly Asp Pro Gly Ala Gly Asp
5 10 15

CGG AAG AAA GAA GAA AGG TTT GAT CCG AGT GCA CAA CCA CCG TTC AAG 150
Arg Lys Lys Glu Glu Arg Phe Asp Pro Ser Ala Gln Pro Pro Phe Lys
20 25 30 35

ATC GGA GAT ATA AGG GCG GCG ATT CCT AAG CAC TGT TGG GTT AAG AGT 198
Ile Gly Asp Ile Arg Ala Ala Ile Pro Lys His Cys Trp Val Lys Ser
40 45 50

CCT TTG AGA TCA ATG AGT TAC GTC GTC AGA GAC ATT ATC GCC GTC GCG 246
Pro Leu Arg Ser Met Ser Tyr Val Val Arg Asp Ile Ile Ala Val Ala
55 60 65

GCT TTG GCC ATC GCT GCC GTG TAT GTT GAT AGC TGG TTC CTT TGG CCT 294
Ala Leu Ala Ile Ala Ala Val Tyr Val Asp Ser Trp Phe Leu Trp Pro
70 75 80

CTT TAT TGG GCC GCC CAA GGA ACA CTT TTC TGG GCC ATC TTT GTT CTC 342
Leu Tyr Trp Ala Ala Gln Gly Thr Leu Phe Trp Ala Ile Phe Val Leu
85 90 95

GGC CAC GAC TGT GGA CAT GGG AGT TTC TCA GAC ATT CCT CTA CTG AAT 390
Gly His Asp Cys Gly His Gly Ser Phe Ser Asp Ile Pro Leu Leu Asn
100 105 110 115

AGT GTG GTT GGT CAC ATT CTT CAT TCT TTC ATC CTC GTT CCT TAC CAT 438
Ser Val Val Gly His Ile Leu His Ser Phe Ile Leu Val Pro Tyr His
120 125 130

GGT TGG AGA ATA AGC CAC CGG ACA CAC CAC CAG AAC CAT GGC CAT GTT 486
Gly Trp Arg Ile Ser His Arg Thr His His Gln Asn His Gly His Val
135 140 145

GAA AAC GAC GAG TCA TGG GTT CCG TTA CCA GAA AGG GTG TAC AAG AAA 534
Glu Asn Asp Glu Ser Trp Val Pro Leu Pro Glu Arg Val Tyr Lys Lys
150 155 160

TTG CCC CAC AGT ACT CGG ATG CTC AGA TAC ACT GTC CCT CTC CCC ATG 582
Leu Pro His Ser Thr Arg Met Leu Arg Tyr Thr Val Pro Leu Pro Met
165 170 175

CTC GCA TAT CCT CTC TAT TTG TGC TAC AGA AGT CCT GGA AAA GAA GGA 630
Leu Ala Tyr Pro Leu Tyr Leu Cys Tyr Arg Ser Pro Gly Lys Glu Gly
180 185 190 195

TCA CAT TTT AAC CCA TAC AGT AGT TTA TTT GCT CCA AGC GAG AGA AAG 678
Ser His Phe Asn Pro Tyr Ser Ser Leu Phe Ala Pro Ser Glu Arg Lys
200 205 210

CTT ATT GCA ACT TCA ACT ACT TGT TGG TCC ATA ATG TTC GTC AGT CTT 726
Leu Ile Ala Thr Ser Thr Thr Cys Trp Ser Ile Met Phe Val Ser Leu
215 220 225

ATC GCT CTA TCT TTC GTC TTC GGT CCA CTC GCG GTT CTT AAA GTC TAC 774
Ile Ala Leu Ser Phe Val Phe Gly Pro Leu Ala Val Leu Lys Val Tyr
230 235 240

GGT GTA CCG TAC ATT ATC TTT GTG ATG TGG TTG GAT GCT GTC ACG TAT 822
Gly Val Pro Tyr Ile Ile Phe Val Met Trp Leu Asp Ala Val Thr Tyr
245 250 255

TTG CAT CAT CAT GGT CAC GAT GAG AAG TTG ... ... ... ... ... ... 870
Leu His His His Gly His Asp Glu Lys Leu Pro Trp Tyr Arg Gly Lys
260 265 270 275

GAA TGG AGT TAT CTA CGT GGA GGA TTA ACA ACA ATT GAT AGA GAT TAC 918
Glu Trp Ser Tyr Leu Arg Gly Gly Leu Thr Thr Ile Asp Arg Asp Tyr
280 285 290

GGA ATC TTT AAC AAC ATT CAT CAC GAC ATT GGA ACT CAC GTG ATC CAT 966
Gly Ile Phe Asn Asn Ile His His Asp Ile Gly Thr His Val Ile His
295 300 305

CAT CTC TTC CCA CAA ATC CCT CAC TAT CAC TTG GTC GAC GCC ACG AAA 1014
His Leu Phe Pro Gln Ile Pro His Tyr His Leu Val Asp Ala Thr Lys
310 315 320

GCA GCT AAA CAT GTG TTG GGA AGA TAC TAC AGA GAA CCA AAG ACG TCA 1062
Ala Ala Lys His Val Leu Gly Arg Tyr Tyr Arg Glu Pro Lys Thr Ser
325 330 335

GGA GCA ATA CCG ATC CAC TTG GTG GAG AGT TTG GTC GCA AGT ATT AAG 1110
Gly Ala Ile Pro Ile His Leu Val Glu Ser Leu Val Ala Ser Ile Lys
340 345 350 355

AAA GAT CAT TAC GTC AGC GAC ACT GGT GAT ATT GTC TTC TAC GAG ACA 1158
Lys Asp His Tyr Val Ser Asp Thr Gly Asp Ile Val Phe Tyr Glu Thr
360 365 370

GAT CCA GAT CTC TAC GTT TAC GCT TCT GAC AAA TCT AAA ATC AAT TAATCTCCAT 1213
Asp Pro Asp Leu Tyr Val Tyr Ala Ser Asp Lys Ser Lys Ile Asn
375 380 385

TTGTTTAGCT CTATTAGGAA TAAACCAGCC CACTTTTAAA ATTTTTATTT CTTGTTGTTT 1273

TTAAGTTAAA AGTGTACTCG TGAAACTCTT TTTTTTTTCT TTTTTTTTAT TAATGTATTT 1333

ACATTACAAG GCGTAAA 1350
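
The feature table of SEQ ID NO:1 can be checked for internal consistency. The following is a minimal sketch in Python, not part of the original disclosure. It verifies that the CDS location 46..1206 spans a whole number of codons, that the 45-nucleotide leader printed on the first sequence line places ATG at position 46, and that the codon count agrees with the 386 amino acids stated for SEQ ID NO:2, on the assumption that the last codon of the CDS is the stop codon. The function name `cds_codon_count` is illustrative only.

```python
# Consistency check for the CDS feature of SEQ ID NO:1 (illustrative sketch).

# First sequence line of SEQ ID NO:1 as printed in the listing (positions 1-54).
FIRST_LINE = "CTCTCTCTCT CTCTCTTCTC TCTTTCTCTC CCCCTCTCTC CGGCG ATG GTT GTT"

CDS_START, CDS_END = 46, 1206   # (B) LOCATION: 46..1206
SEQ_LENGTH = 1350               # (A) LENGTH: 1350 base pairs
PROTEIN_LENGTH = 386            # LENGTH stated for SEQ ID NO:2


def cds_codon_count(start: int, end: int) -> int:
    """Return the number of codons in a 1-based, inclusive CDS interval."""
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return span // 3


bases = FIRST_LINE.replace(" ", "")
start_codon = bases[CDS_START - 1:CDS_START + 2]   # bases 46-48
codons = cds_codon_count(CDS_START, CDS_END)

print(start_codon)   # ATG
print(codons)        # 387 codons: 386 residues plus one stop codon (assumed)
```

The same arithmetic can be applied to the CDS locations given for the other nucleotide sequences in this listing.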

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 386 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Val Val Ala Met Asp Gln Arg Thr Asn Val Asn Gly Asp Pro Gly
1 5 10 15

Ala Gly Asp Arg Lys Lys Glu Glu Arg Phe Asp Pro Ser Ala Gln Pro
20 25 30

Pro Phe Lys Ile Gly Asp Ile Arg Ala Ala Ile Pro Lys His Cys Trp
35 40 45

Val Lys Ser Pro Leu Arg Ser Met Ser Tyr Val Val Arg Asp Ile Ile
50 55 60

Ala Val Ala Ala Leu Ala Ile Ala Ala Val Tyr Val Asp Ser Trp Phe
65 70 75 80

Leu Trp Pro Leu Tyr Trp Ala Ala Gln Gly Thr Leu Phe Trp Ala Ile
85 90 95

Phe Val Leu Gly His Asp Cys Gly His Gly Ser Phe Ser Asp Ile Pro
100 105 110

Leu Leu Asn Ser Val Val Gly His Ile Leu His Ser Phe Ile Leu Val
115 120 125

Pro Tyr His Gly Trp Arg Ile Ser His Arg Thr His His Gln Asn His
130 135 140

Gly His Val Glu Asn Asp Glu Ser Trp Val Pro Leu Pro Glu Arg Val
145 150 155 160

Tyr Lys Lys Leu Pro His Ser Thr Arg Met Leu Arg Tyr Thr Val Pro
165 170 175

Leu Pro Met Leu Ala Tyr Pro Leu Tyr Leu Cys Tyr Arg Ser Pro Gly
180 185 190

Lys Glu Gly Ser His Phe Asn Pro Tyr Ser Ser Leu Phe Ala Pro Ser
195 200 205

Glu Arg Lys Leu Ile Ala Thr Ser Thr Thr Cys Trp Ser Ile Met Phe
210 215 220

Val Ser Leu Ile Ala Leu Ser Phe Val Phe Gly Pro Leu Ala Val Leu
225 230 235 240

Lys Val Tyr Gly Val Pro Tyr Ile Ile Phe Val Met Trp Leu Asp Ala
245 250 255

Val Thr Tyr Leu His His His Gly His Asp Glu Lys Leu Pro Trp Tyr
260 265 270

Arg Gly Lys Glu Trp Ser Tyr Leu Arg Gly Gly Leu Thr Thr Ile Asp
275 280 285

Arg Asp Tyr Gly Ile Phe Asn Asn Ile His His Asp Ile Gly Thr His
290 295 300

Val Ile His His Leu Phe Pro Gln Ile Pro His Tyr His Leu Val Asp
305 310 315 320

Ala Thr Lys Ala Ala Lys His Val Leu Gly Arg Tyr Tyr Arg Glu Pro
325 330 335

Lys Thr Ser Gly Ala Ile Pro Ile His Leu Val Glu Ser Leu Val Ala
340 345 350

Ser Ile Lys Lys Asp His Tyr Val Ser Asp Thr Gly Asp Ile Val Phe
355 360 365

Tyr Glu Thr Asp Pro Asp Leu Tyr Val Tyr Ala Ser Asp Lys Ser Lys
370 375 380

Ile Asn
385

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 255 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Arabidopsis thaliana

(vii) IMMEDIATE SOURCE:
(B) CLONE: pF1
(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 68..255

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

AAATTCATCA AACCCTTTCT TCACCACATT ATTTTCACTG AGCGCATAAC ATTTTTGAGA 60 

CAAGAGACTC TCTCTCTCTC TCTTCTCTCT TTCTCTCCCC CTCTCTCCGG CGATGGTTGT 120 

TGCTATGGAC CAACGCACCA ATGTGAACGG AGATCCCGGC GCCGGAGACC GGAAGAAAGA 180 

AGAAAGGTTT GATCCGAGTG CACAACCACC GTTCAAGATC GGAGATATAA GGGCGGCGAT 240 

TCCTAAGCAC TGTTG 255 

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1525 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Arabidopsis thaliana
(vii) IMMEDIATE SOURCE:

(B) CLONE: pACF2-2

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 10..1350

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

CAAGTTCTA ATG GCG AAC TTG GTC TTA TCA GAA TGT GGT ATA CGA CCT 48
Met Ala Asn Leu Val Leu Ser Glu Cys Gly Ile Arg Pro
1 5 10

CTC CCC AGA ATC TAC ACA ACA CCC AGA TCC AAT TTC CTC TCC AAC AAC 96
Leu Pro Arg Ile Tyr Thr Thr Pro Arg Ser Asn Phe Leu Ser Asn Asn
15 20 25

AAC AAA TTC AGA CCA TCA CTT TCT TCT TCT TCT TAC AAA ACA TCA TCA 144
Asn Lys Phe Arg Pro Ser Leu Ser Ser Ser Ser Tyr Lys Thr Ser Ser
30 35 40 45

TCT CCT CTG TCT TTT GGT CTG AAT TCA CGA GAT GGG TTC ACG AGG AAT 192
Ser Pro Leu Ser Phe Gly Leu Asn Ser Arg Asp Gly Phe Thr Arg Asn
50 55 60

TGG GCG TTG AAT GTG AGC ACA CCA TTA ACG ACA CCA ATA TTT GAG ... 240
Trp Ala Leu Asn Val Ser Thr Pro Leu Thr Thr Pro Ile Phe Glu Glu
65 70 75

TCT CCA TTG GAG GAA GAT AAT AAA CAG AGA TTC GAT CCA GGT GCG CCT 288
Ser Pro Leu Glu Glu Asp Asn Lys Gln Arg Phe Asp Pro Gly Ala Pro
80 85 90

CCT CCG TTC AAT TTA GCT GAT ATT AGA GCA GCT ATA CCT AAG CAT TGT 336
Pro Pro Phe Asn Leu Ala Asp Ile Arg Ala Ala Ile Pro Lys His Cys
95 100 105

TGG GTT AAG AAT CCA TGG AAG TCT TTG AGT TAT GTC GTC AGA GAC GTC 384
Trp Val Lys Asn Pro Trp Lys Ser Leu Ser Tyr Val Val Arg Asp Val
110 115 120 125

GCT ATC GTC TTT GCA TTG GCT GCT GGA GCT GCT TAC CTC AAC AAT TGG 432
Ala Ile Val Phe Ala Leu Ala Ala Gly Ala Ala Tyr Leu Asn Asn Trp
130 135 140

ATT GTT TGG CCT CTC TAT TGG CTC GCT CAA GGA ACC ATG TTT TGG GCT 480
Ile Val Trp Pro Leu Tyr Trp Leu Ala Gln Gly Thr Met Phe Trp Ala
145 150 155

CTC TTT GTT CTT GGT CAT GAC TGT GGA CAT GGT AGT TTC TCA AAT GA. 528
Leu Phe Val Leu Gly His Asp Cys Gly His Gly Ser Phe Ser Asn Asp
160 165 170

CCG AAG TTG AAC AGT GTG GTC GGT CAT CTT CTT CAT TCC TCA ATT CTC 576
Pro Lys Leu Asn Ser Val Val Gly His Leu Leu His Ser Ser Ile Leu
175 180 185

GTC CCA TAC CAT GGC TGG AGA ATT AGT CAC AGA ACT ... ... ... ... 624
Val Pro Tyr His Gly Trp Arg Ile Ser His Arg Thr His His Gln Asn
190 195 200 205

CAT GGA CAT GTT GAG AAT GAC GAA TCT TGG CAT CCT ATG TCT GAG AAA 672
His Gly His Val Glu Asn Asp Glu Ser Trp His Pro Met Ser Glu Lys
210 215 220
ATC TAC AAT ACT TTG GAC AAG CCG ACT AGA TTC TTT AGA TTT ACA CTG 720
Ile Tyr Asn Thr Leu Asp Lys Pro Thr Arg Phe Phe Arg Phe Thr Leu
225 230 235

CCT CTC GTG ATG CTT GCA TAC CCT TTC TAC TTG TGG GCT CGA AGT CCG 768
Pro Leu Val Met Leu Ala Tyr Pro Phe Tyr Leu Trp Ala Arg Ser Pro
240 245 250

GGG AAA AAG GGT TCT CAT TAC CAT CCA GAC AGT GAC TTG TTC CTG CCT 816
Gly Lys Lys Gly Ser His Tyr His Pro Asp Ser Asp Leu Phe Leu Pro
255 260 265

AAA GAG AGA AAG GAT GTC CTC ACT TCT ACT GCT TGT TGG ACT GCA ATG 864
Lys Glu Arg Lys Asp Val Leu Thr Ser Thr Ala Cys Trp Thr Ala Met
270 275 280 285

GCT GCT CTG CTT GTT TGT CTG AAC TTC ACA ATC GGT CCA ATT CAA ATG 912
Ala Ala Leu Leu Val Cys Leu Asn Phe Thr Ile Gly Pro Ile Gln Met
290 295 300

CTC AAA CTT TAT GGA ATT CCT TAC TGG ATA AAT GTA ATG TGG TTG GAC 960
Leu Lys Leu Tyr Gly Ile Pro Tyr Trp Ile Asn Val Met Trp Leu Asp
305 310 315

TTT GTG ACT TAC CTG CAT CAC CAT GGT CAT GAA GAT AAG CTT CCT TGG 1008
Phe Val Thr Tyr Leu His His His Gly His Glu Asp Lys Leu Pro Trp
320 325 330

TAC CGT GGG AAG GAG TGG AGT TAC CTG AGA GGA GGA CTT ACA ACA TTG 1056
Tyr Arg Gly Lys Glu Trp Ser Tyr Leu Arg Gly Gly Leu Thr Thr Leu
335 340 345

GAT CGT GAC TAC GGA TTG ATC AAT AAC ATC CAT CAT GAT ATT GGA ACT 1104
Asp Arg Asp Tyr Gly Leu Ile Asn Asn Ile His His Asp Ile Gly Thr
350 355 360 365

CAT GTG ATA CAT CAT CTT TTC CCG CAG ATC CCA CAT TAT CAT CTA GTA 1152
His Val Ile His His Leu Phe Pro Gln Ile Pro His Tyr His Leu Val
370 375 380

GAA GCA ACA GAA GCA GCT AAA CCA GTA TTA GGG AAG TAT TAC AGG GAG 1200
Glu Ala Thr Glu Ala Ala Lys Pro Val Leu Gly Lys Tyr Tyr Arg Glu
385 390 395

CCT GAT AAG TCT GGA CCG TTG CCA TTA CAT TTA CTG GAA ATT CTA GCG 1248
Pro Asp Lys Ser Gly Pro Leu Pro Leu His Leu Leu Glu Ile Leu Ala
400 405 410

AAA AGT ATA AAA GAA GAT CAT TAC GTG AGC GAC GAA GGA GAA GTT GTA 1296
Lys Ser Ile Lys Glu Asp His Tyr Val Ser Asp Glu Gly Glu Val Val
415 420 425

TAC TAT AAA GCA GAT CCA AAT CTG TAT GGA GAG GTG AAA GTA AGA GCA 1344
Tyr Tyr Lys Ala Asp Pro Asn Leu Tyr Gly Glu Val Lys Val Arg Ala
430 435 440 445

GAT TGAAATGAAG CAGGCTTGAG ATTGAAGTTT TTTCTATTTC AGACCAGCTG 1397
Asp

.......... TACTGTATCA ATTTATTGTG TCACCCACCA GAGAGTTAGT ATCTCTGAAT 1457
ACGATCGATC AGATGGAAAC AACAAATTTG TTTGCGATAC TGAAGCTATA TATACCATAC 1517
ATTGGATT 1525

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 446 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Met Ala Asn Leu Val Leu Ser Glu Cys Gly Ile Arg Pro Leu Pro Arg
1 5 10 15

Ile Tyr Thr Thr Pro Arg Ser Asn Phe Leu Ser Asn Asn Asn Lys Phe
20 25 30

Arg Pro Ser Leu Ser Ser Ser Ser Tyr Lys Thr Ser Ser Ser Pro Leu
35 40 45

Ser Phe Gly Leu Asn Ser Arg Asp Gly Phe Thr Arg Asn Trp Ala Leu
50 55 60

Asn Val Ser Thr Pro Leu Thr Thr Pro Ile Phe Glu Glu Ser Pro Leu
65 70 75 80

Glu Glu Asp Asn Lys Gln Arg Phe Asp Pro Gly Ala Pro Pro Pro Phe
85 90 95

Asn Leu Ala Asp Ile Arg Ala Ala Ile Pro Lys His Cys Trp Val Lys
100 105 110

Asn Pro Trp Lys Ser Leu Ser Tyr Val Val Arg Asp Val Ala Ile Val
115 120 125

Phe Ala Leu Ala Ala Gly Ala Ala Tyr Leu Asn Asn Trp Ile Val Trp
130 135 140

Pro Leu Tyr Trp Leu Ala Gln Gly Thr Met Phe Trp Ala Leu Phe Val
145 150 155 160

Leu Gly His Asp Cys Gly His Gly Ser Phe Ser Asn Asp Pro Lys Leu
165 170 175

Asn Ser Val Val Gly His Leu Leu His Ser Ser Ile Leu Val Pro Tyr
180 185 190

His Gly Trp Arg Ile Ser His Arg Thr His His Gln Asn His Gly His
195 200 205

Val Glu Asn Asp Glu Ser Trp His Pro Met Ser Glu Lys Ile Tyr Asn
210 215 220

Thr Leu Asp Lys Pro Thr Arg Phe Phe Arg Phe Thr Leu Pro Leu Val
225 230 235 240

Met Leu Ala Tyr Pro Phe Tyr Leu Trp Ala Arg Ser Pro Gly Lys Lys
245 250 255

Gly Ser His Tyr His Pro Asp Ser Asp Leu Phe Leu Pro Lys Glu Arg
260 265 270

Lys Asp Val Leu Thr Ser Thr Ala Cys Trp Thr Ala Met Ala Ala Leu
275 280 285

Leu Val Cys Leu Asn Phe Thr Ile Gly Pro Ile Gln Met Leu Lys Leu
290 295 300

Tyr Gly Ile Pro Tyr Trp Ile Asn Val Met Trp Leu Asp Phe Val Thr
305 310 315 320

Tyr Leu His His His Gly His Glu Asp Lys Leu Pro Trp Tyr Arg Gly
325 330 335

Lys Glu Trp Ser Tyr Leu Arg Gly Gly Leu Thr Thr Leu Asp Arg Asp
340 345 350

Tyr Gly Leu Ile Asn Asn Ile His His Asp Ile Gly Thr His Val Ile
355 360 365

His His Leu Phe Pro Gln Ile Pro His Tyr His Leu Val Glu Ala Thr
370 375 380

Glu Ala Ala Lys Pro Val Leu Gly Lys Tyr Tyr Arg Glu Pro Asp Lys
385 390 395 400

Ser Gly Pro Leu Pro Leu His Leu Leu Glu Ile Leu Ala Lys Ser Ile
405 410 415

Lys Glu Asp His Tyr Val Ser Asp Glu Gly Glu Val Val Tyr Tyr Lys
420 425 430

Ala Asp Pro Asn Leu Tyr Gly Glu Val Lys Val Arg Ala Asp
435 440 445



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1429 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Brassica napus
(vii) IMMEDIATE SOURCE:

(B) CLONE: pBNSF3-f2

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 79..1212

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

TTCAAATTCA GACAATCCCC TTCTTCTCCC CGGTTTCGTC TGAACTCTCG AAACTGGGCG 60

TTGAATGTAA CCACACCT CTA ACA GTC GAC TCC TCA TCA TCT CCT CCA ATC 111
Leu Thr Val Asp Ser Ser Ser Ser Pro Pro Ile
1 5 10

GAG GAA GAA CCC AAA ACG CAG AGA TTC GAC CCA GGC GCT CCT CCT CCG 159
Glu Glu Glu Pro Lys Thr Gln Arg Phe Asp Pro Gly Ala Pro Pro Pro
15 20 25

TTC AAC CTA GCT GAC ATC AGA GCG GCG ATA CCT AAG CAT TGC TGG GTT 207
Phe Asn Leu Ala Asp Ile Arg Ala Ala Ile Pro Lys His Cys Trp Val
30 35 40

AAG AAT CCA TGG AAG TCT ATG AGT TAC GTC GTC AGA GAG CTA GCC ATC 255
Lys Asn Pro Trp Lys Ser Met Ser Tyr Val Val Arg Glu Leu Ala Ile
45 50 55

GTG TTC GCA CTA GCT GCT GGA GCT GCT TAC CTC AAC AAT TGG CTT GTT 303
Val Phe Ala Leu Ala Ala Gly Ala Ala Tyr Leu Asn Asn Trp Leu Val
60 65 70 75

TGG CCT CTC TAT TGG ATT GCT CAA GGA ACC ATG TTC TGG GCT CTC TTT 351
Trp Pro Leu Tyr Trp Ile Ala Gln Gly Thr Met Phe Trp Ala Leu Phe
80 85 90

GTT CTT GGC CAT GAC TGT GGA CAT GGA AGC TTC TCA AAT GAT CCG AGG 399
Val Leu Gly His Asp Cys Gly His Gly Ser Phe Ser Asn Asp Pro Arg
95 100 105

TTG AAC AGT GTG GTG GGT CAC CTT CTT CAT TCC TCT ATT CTA GTC CCT 447
Leu Asn Ser Val Val Gly His Leu Leu His Ser Ser Ile Leu Val Pro
110 115 120

TAC CAT GGC TGG AGA ATT AGC CAC AGA ACT CAC CAC CAG AAC CAT GGA 495
Tyr His Gly Trp Arg Ile Ser His Arg Thr His His Gln Asn His Gly
125 130 135

CAT GTT GAG AAC GAT GAA TCT TGG CAT CCT ATG TCT GAG AAA ATC TAC 543
His Val Glu Asn Asp Glu Ser Trp His Pro Met Ser Glu Lys Ile Tyr
140 145 150 155

AAG AGT TTG GAC AAA CCC ACT CGG TTC TTT AGA TTT ACA TTG CCT CTC 591
Lys Ser Leu Asp Lys Pro Thr Arg Phe Phe Arg Phe Thr Leu Pro Leu
160 165 170

GTG ATG CTC GCT TAC CCT TTC TAC TTG TGG GCA AGA AGT CCA GGG AAG 639
Val Met Leu Ala Tyr Pro Phe Tyr Leu Trp Ala Arg Ser Pro Gly Lys
175 180 185

AAG GGT TCT CAT TAC CAT CCA GAC AGC GAC TTG TTC CTT CCT AAA GAG 687
Lys Gly Ser His Tyr His Pro Asp Ser Asp Leu Phe Leu Pro Lys Glu
190 195 200

AGA AAC GAT GTT CTC ACT TCT ACC GCT TGT TGG ACT GCA ATG GCT GTT 735
Arg Asn Asp Val Leu Thr Ser Thr Ala Cys Trp Thr Ala Met Ala Val
205 210 215

CTG CTT GTC TGT CTC AAC TTC GTG ATG GGT CCA ATG CAA ATG CTC AAA 783
Leu Leu Val Cys Leu Asn Phe Val Met Gly Pro Met Gln Met Leu Lys
220 225 230 235

CTT TAT GTC ATT CCT TAC TGG ATA AAT GTA ATG TGG TTG GAC TTT GTG 831
Leu Tyr Val Ile Pro Tyr Trp Ile Asn Val Met Trp Leu Asp Phe Val
240 245 250

ACT TAC CTG CAT CAC CAT GGT CAT GAA GAT AAG CTC CCT TGG TAC CGT 879
Thr Tyr Leu His His His Gly His Glu Asp Lys Leu Pro Trp Tyr Arg
255 260 265

GGG AAG GAA TGG AGT TAC TTG AGA GGA GGA CTT ACA ACA TTG GAC CGG 927
Gly Lys Glu Trp Ser Tyr Leu Arg Gly Gly Leu Thr Thr Leu Asp Arg
270 275 280

GAC TAC GGA TTG ATC AAC AAC ATC CAT CAC GAC ATT GGA ACT CAT GTG 975
Asp Tyr Gly Leu Ile Asn Asn Ile His His Asp Ile Gly Thr His Val
285 290 295

ATA CAT CAT CTT TTC CCT CAG ATC CCA CAT TAT CAT CTA GTA GAA GCA 1023
Ile His His Leu Phe Pro Gln Ile Pro His Tyr His Leu Val Glu Ala
300 305 310 315

ACA GAA GCA GCT AAA CCA GTA TTA GGG AAG TAT TAT AGG GAG CCT GAT 1071
Thr Glu Ala Ala Lys Pro Val Leu Gly Lys Tyr Tyr Arg Glu Pro Asp
320 325 330

AAG TCT GGA CCT TTG CCA TTA CAT TTA CTG GGA ATC TTA GCA AAA AGT 1119
Lys Ser Gly Pro Leu Pro Leu His Leu Leu Gly Ile Leu Ala Lys Ser
335 340 345

ATT AAA GAA GAT CAT TTT GTG AGC GAT GAA GGA GAT GTT GTA TAC TAT 1167
Ile Lys Glu Asp His Phe Val Ser Asp Glu Gly Asp Val Val Tyr Tyr
350 355 360

GAA GCA GAC CCT AAT CTC TAT GGA GAG ATC AAG GTA ACA GCA GAG 1212
Glu Ala Asp Pro Asn Leu Tyr Gly Glu Ile Lys Val Thr Ala Glu
365 370 375

TGAAATGAAG CTGTCAGATT TATCTATTTC TGACCAGCTG ATTTTTTTTG CTTATTAATG 1272
TCAATTCATT GTCTTACCAT TATCTCTGAA TACAATCAGA TGGAAACCCC AACTTTGTTT 1332
TCAATACTTG AAGCTATATA TATATATATA TATGTAAGAT ACATTGTATT GTCATTAGAT 1392
TCACCATTCT CAAGGTTCTT ATACAAAAAA AAAAAAA 1429

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 378 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Leu Thr Val Asp Ser Ser Ser Ser Pro Pro Ile Glu Glu Glu Pro Lys
1 5 10 15

Thr Gln Arg Phe Asp Pro Gly Ala Pro Pro Pro Phe Asn Leu Ala Asp
20 25 30

Ile Arg Ala Ala Ile Pro Lys His Cys Trp Val Lys Asn Pro Trp Lys
35 40 45

Ser Met Ser Tyr Val Val Arg Glu Leu Ala Ile Val Phe Ala Leu Ala
50 55 60

Ala Gly Ala Ala Tyr Leu Asn Asn Trp Leu Val Trp Pro Leu Tyr Trp
65 70 75 80

Ile Ala Gln Gly Thr Met Phe Trp Ala Leu Phe Val Leu Gly His Asp
85 90 95

Cys Gly His Gly Ser Phe Ser Asn Asp Pro Arg Leu Asn Ser Val Val
100 105 110

Gly His Leu Leu His Ser Ser Ile Leu Val Pro Tyr His Gly Trp Arg
115 120 125

Ile Ser His Arg Thr His His Gln Asn His Gly His Val Glu Asn Asp
130 135 140

Glu Ser Trp His Pro Met Ser Glu Lys Ile Tyr Lys Ser Leu Asp Lys
145 150 155 160

Pro Thr Arg Phe Phe Arg Phe Thr Leu Pro Leu Val Met Leu Ala Tyr
165 170 175

Pro Phe Tyr Leu Trp Ala Arg Ser Pro Gly Lys Lys Gly Ser His Tyr
180 185 190

His Pro Asp Ser Asp Leu Phe Leu Pro Lys Glu Arg Asn Asp Val Leu
195 200 205

Thr Ser Thr Ala Cys Trp Thr Ala Met Ala Val Leu Leu Val Cys Leu
210 215 220

Asn Phe Val Met Gly Pro Met Gln Met Leu Lys Leu Tyr Val Ile Pro
225 230 235 240

Tyr Trp Ile Asn Val Met Trp Leu Asp Phe Val Thr Tyr Leu His His
245 250 255

His Gly His Glu Asp Lys Leu Pro Trp Tyr Arg Gly Lys Glu Trp Ser
260 265 270

Tyr Leu Arg Gly Gly Leu Thr Thr Leu Asp Arg Asp Tyr Gly Leu Ile
275 280 285

Asn Asn Ile His His Asp Ile Gly Thr His Val Ile His His Leu Phe
290 295 300

Pro Gln Ile Pro His Tyr His Leu Val Glu Ala Thr Glu Ala Ala Lys
305 310 315 320

Pro Val Leu Gly Lys Tyr Tyr Arg Glu Pro Asp Lys Ser Gly Pro Leu
325 330 335

Pro Leu His Leu Leu Gly Ile Leu Ala Lys Ser Ile Lys Glu Asp His
340 345 350

Phe Val Ser Asp Glu Gly Asp Val Val Tyr Tyr Glu Ala Asp Pro Asn
355 360 365

Leu Tyr Gly Glu Ile Lys Val Thr Ala Glu
370 375



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1429 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Brassica napus
(vii) IMMEDIATE SOURCE:

(B) CLONE: pBNSFd-2
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1215

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

TTC AAA TTC AGA CAA TCC CCT TCT TCT CCC CGG TTT CGT CTG AAC TCT 48
Phe Lys Phe Arg Gln Ser Pro Ser Ser Pro Arg Phe Arg Leu Asn Ser
1 5 10 15

CGA AAC TGG GCG TTG AAT GTA ACC ACA CCT CTA ACA GTC GAC TCC TCA 96
Arg Asn Trp Ala Leu Asn Val Thr Thr Pro Leu Thr Val Asp Ser Ser
20 25 30

TCA TCT CCT CCA ATC GAG GAA GAA CCC AAA ACG CAG AGA TTC GAC CCA 144
Ser Ser Pro Pro Ile Glu Glu Glu Pro Lys Thr Gln Arg Phe Asp Pro
35 40 45

GGC GCT CCT CCT CCG TTC AAC CTA GCT GAC ATC AGA GCG GCG ATA CCT 192
Gly Ala Pro Pro Pro Phe Asn Leu Ala Asp Ile Arg Ala Ala Ile Pro
50 55 60

AAG CAT TGC TGG GTT AAG AAT CCA TGG AAG TCT ATG AGT TAC GTC GTC 240
Lys His Cys Trp Val Lys Asn Pro Trp Lys Ser Met Ser Tyr Val Val
65 70 75 80

AGA GAG CTA GCC ATC GTG TTC GCA CTA GCT GCT GGA GCT GCT TAC CTC 288
Arg Glu Leu Ala Ile Val Phe Ala Leu Ala Ala Gly Ala Ala Tyr Leu
85 90 95

AAC AAT TGG CTT GTT TGG CCT CTC TAT TGG ATT GCT CAA GGA ACC ATG 336
Asn Asn Trp Leu Val Trp Pro Leu Tyr Trp Ile Ala Gln Gly Thr Met
100 105 110

TTC TGG GCT CTC TTT GTT CTT GGC CAT GAC TGT GGA CAT GGA AGC TTC 384
Phe Trp Ala Leu Phe Val Leu Gly His Asp Cys Gly His Gly Ser Phe
115 120 125

TCA AAT GAT CCG AGG TTG AAC AGT GTG GTG GGT CAC CTT CTT CAT TCC 432
Ser Asn Asp Pro Arg Leu Asn Ser Val Val Gly His Leu Leu His Ser
130 135 140

TCT ATT CTA GTC CCT TAC CAT GGC TGG AGA ATT AGC CAC AGA ACT CAC 480
Ser Ile Leu Val Pro Tyr His Gly Trp Arg Ile Ser His Arg Thr His
145 150 155 160

CAC CAG AAC CAT GGA CAT GTT GAG AAC GAT GAA TCT TGG CAT CCT ATG 528
His Gln Asn His Gly His Val Glu Asn Asp Glu Ser Trp His Pro Met
165 170 175

TCT GAG AAA ATC TAC AAG AGT TTG GAC AAA CCC ACT CGG TTC TTT AGA 576
Ser Glu Lys Ile Tyr Lys Ser Leu Asp Lys Pro Thr Arg Phe Phe Arg
180 185 190

TTT ACA TTG CCT CTC GTG ATG CTC GCT TAC CCT TTC TAC TTG TGG GCA 624
Phe Thr Leu Pro Leu Val Met Leu Ala Tyr Pro Phe Tyr Leu Trp Ala
195 200 205

AGA AGT CCA GGG AAG AAG GGT TCT CAT TAC CAT CCA GAC AGC GAC TTG 672
Arg Ser Pro Gly Lys Lys Gly Ser His Tyr His Pro Asp Ser Asp Leu
210 215 220

TTC CTT CCT AAA GAG AGA AAC GAT GTT CTC ACT TCT ACC GCT TGT TGG 720
Phe Leu Pro Lys Glu Arg Asn Asp Val Leu Thr Ser Thr Ala Cys Trp
225 230 235 240

ACT GCA ATG GCT GTT CTG CTT GTC TGT CTC AAC TTC GTG ATG GGT CCA 768
Thr Ala Met Ala Val Leu Leu Val Cys Leu Asn Phe Val Met Gly Pro
245 250 255

ATG CAA ATG CTC AAA CTT TAT GTC ATT CCT TAC TGG ATA AAT GTA ATG 816
Met Gln Met Leu Lys Leu Tyr Val Ile Pro Tyr Trp Ile Asn Val Met
260 265 270

TGG TTG GAC TTT GTG ACT TAC CTG CAT CAC CAT GGT CAT GAA GAT AAG 864
Trp Leu Asp Phe Val Thr Tyr Leu His His His Gly His Glu Asp Lys
275 280 285

CTC CCT TGG TAC CGT GGG AAG GAA TGG AGT TAC TTG AGA GGA GGA CTT 912
Leu Pro Trp Tyr Arg Gly Lys Glu Trp Ser Tyr Leu Arg Gly Gly Leu
290 295 300

ACA ACA TTG GAC CGG GAC TAC GGA TTG ATC AAC AAC ATC CAT CAC GAC 960
Thr Thr Leu Asp Arg Asp Tyr Gly Leu Ile Asn Asn Ile His His Asp
305 310 315 320

ATT GGA ACT CAT GTG ATA CAT CAT CTT TTC CCT CAG ATC CCA CAT TAT 1008
Ile Gly Thr His Val Ile His His Leu Phe Pro Gln Ile Pro His Tyr
325 330 335

CAT CTA GTA GAA GCA ACA GAA GCA GCT AAA CCA GTA TTA GGG AAG TAT 1056
His Leu Val Glu Ala Thr Glu Ala Ala Lys Pro Val Leu Gly Lys Tyr
340 345 350

TAT AGG GAG CCT GAT AAG TCT GGA CCT TTG CCA TTA CAT TTA CTG GGA 1104
Tyr Arg Glu Pro Asp Lys Ser Gly Pro Leu Pro Leu His Leu Leu Gly
355 360 365

ATC TTA GCA AAA AGT ATT AAA GAA GAT CAT TTT GTG AGC GAT GAA GGA 1152
Ile Leu Ala Lys Ser Ile Lys Glu Asp His Phe Val Ser Asp Glu Gly
370 375 380

GAT GTT GTA TAC TAT GAA GCA GAC CCT AAT CTC TAT GGA GAG ATC AAG 1200
Asp Val Val Tyr Tyr Glu Ala Asp Pro Asn Leu Tyr Gly Glu Ile Lys
385 390 395 400

GTA ACA GCA GAG TGAAATGAAG CTGTCAGATT TATCTATTTC TGACCAGCTG 1252
Val Thr Ala Glu
405

ATTTTTTTTG CTTATTAATG TCAATTCATT GTGTTACCAT TATCTCTGAA TACAATCAGA 1312
TGGAAACCCC AACTTTGTTT TCAATACTTG AAGCTATATA TATATATATA TATGTAAGAT 1372
ACATTGTATT GTCATTAGAT TCACCATTCT CAAGGTTCTT ATACAAAAAA AAAAAAA 1429

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 404 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Phe Lys Phe Arg Gln Ser Pro Ser Ser Pro Arg Phe Arg Leu Asn Ser
1 5 10 15

Arg Asn Trp Ala Leu Asn Val Thr Thr Pro Leu Thr Val Asp Ser Ser
20 25 30

Ser Ser Pro Pro Ile Glu Glu Glu Pro Lys Thr Gln Arg Phe Asp Pro
35 40 45

Gly Ala Pro Pro Pro Phe Asn Leu Ala Asp Ile Arg Ala Ala Ile Pro
50 55 60

Lys His Cys Trp Val Lys Asn Pro Trp Lys Ser Met Ser Tyr Val Val
65 70 75 80

Arg Glu Leu Ala Ile Val Phe Ala Leu Ala Ala Gly Ala Ala Tyr Leu
85 90 95

Asn Asn Trp Leu Val Trp Pro Leu Tyr Trp Ile Ala Gln Gly Thr Met
100 105 110

Phe Trp Ala Leu Phe Val Leu Gly His Asp Cys Gly His Gly Ser Phe
115 120 125

Ser Asn Asp Pro Arg Leu Asn Ser Val Val Gly His Leu Leu His Ser
130 135 140

Ser Ile Leu Val Pro Tyr His Gly Trp Arg Ile Ser His Arg Thr His
145 150 155 160

His Gln Asn His Gly His Val Glu Asn Asp Glu Ser Trp His Pro Met
165 170 175

Ser Glu Lys Ile Tyr Lys Ser Leu Asp Lys Pro Thr Arg Phe Phe Arg
180 185 190

Phe Thr Leu Pro Leu Val Met Leu Ala Tyr Pro Phe Tyr Leu Trp Ala
195 200 205

Arg Ser Pro Gly Lys Lys Gly Ser His Tyr His Pro Asp Ser Asp Leu
210 215 220

Phe Leu Pro Lys Glu Arg Asn Asp Val Leu Thr Ser Thr Ala Cys Trp
225 230 235 240

Thr Ala Met Ala Val Leu Leu Val Cys Leu Asn Phe Val Met Gly Pro
245 250 255

Met Gln Met Leu Lys Leu Tyr Val Ile Pro Tyr Trp Ile Asn Val Met
260 265 270

Trp Leu Asp Phe Val Thr Tyr Leu His His His Gly His Glu Asp Lys
275 280 285

Leu Pro Trp Tyr Arg Gly Lys Glu Trp Ser Tyr Leu Arg Gly Gly Leu
290 295 300

Thr Thr Leu Asp Arg Asp Tyr Gly Leu Ile Asn Asn Ile His His Asp
305 310 315 320

Ile Gly Thr His Val Ile His His Leu Phe Pro Gln Ile Pro His Tyr
325 330 335

His Leu Val Glu Ala Thr Glu Ala Ala Lys Pro Val Leu Gly Lys Tyr
340 345 350

Tyr Arg Glu Pro Asp Lys Ser Gly Pro Leu Pro Leu His Leu Leu Gly
355 360 365

Ile Leu Ala Lys Ser Ile Lys Glu Asp His Phe Val Ser Asp Glu Gly
370 375 380

Asp Val Val Tyr Tyr Glu Ala Asp Pro Asn Leu Tyr Gly Glu Ile Lys
385 390 395 400

Val Thr Ala Glu



(2) ZMFOBMATZON FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

<A) LENGTH: 2181 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 
<D} TOPOLOGY: linear 

(ii) MOLECULE TYPE: CDNA 

(iii) HYPOTHETICAL: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Glycine max 
<vii) IMMEDIATE SOURCE: 

(B) CLONE: pXFl 
(ix) FEATURE: 

(A) NAME/KEY: CDS 
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CB) ZiOCATZON: 



144 
855.. 1997 





SEQUENCE DESCRIPTION: SEQ ID NO: 10: 




ACAATAATaA 


ATCCATATTT 


TTATAATTAA AAGTAGTAGA TTACAGCGAT GCACT7^6A 


60 


AACaXaTXAa 


GTGGACTAAT 


TCTCCCT6GT CAAGCAA6AA AAAAACCAGC TATGACCCAA 


120 




GASTATACAC 


AGAATACTAG TAATTAACTA AGACTG6CTC TGCAATTGCC 


X80 




TTGGAGTAGC 


AGCCACCTGA GAA6ACACTA AGACCTAGAC TAGACCATAC 


240 


ATATGaAGAT 


TAACACGCTT 


ACATAACAAC ATAGGACACT AAGAAAACAC GGCTTACAGA 


300 


GAATCCAGCT 


GACTCTATAA 


GAG6GGTACT TCTGGAGATT AAAATTATCC GAATCACCTT 


360 


CCCACTGCGG 


CTGCTGACGT 


CAGCGAAAGT CAGAACCG2UI AGCGGC6AAG AACCTTCAGA 


420 


AGA6GAGGAA 


GCACTTCGAC 


CTTACAAGAG TTGTTGTCGT TGTTGTTGTC GTTCTCTGGC 


480 


GGAGAAGCGA 


GTTTGGATCG 


CGTTTTCCTC GGAGGCTTCT CGGTCTTCCC CTGTTTCTGC 


540 


AGCTCAGCCA 


G6CCCTCGCA 


AATGGCCTGA AGCTTGGCGT CAACGGCGGA ATGAAGAGGC 


600 


TAATACTCCC 


CGAAGICACC 


ACCGACGGAG GAACCCTGGT GXCGGAGGTT GGGGAAGTTG 


DDU 


AGCCTGGCGA 


AGICACCTCG 


GAGCTTGTAC GCGGCCTTGT GGTACGCCA6 AGCGGCTTCC 


720 


TCGGC66TGT 


CGAAGGTTCC 


CAGCCATAGC CTGGTCCGGA TXCTTCGGGA GTCTAATCTC 


780 


AGCCACCCAC 


TTCCCCCCT6 


AGAAAAGAGA GGAACCACAC TCTCTAAGCC AAAGCAAAAG 


840 


CA6CAGCAGC 


AGCA ATG GTT AAA GAC ACA AAG CCT TTA GCC TAT GCT GCC 
Met Val Lys Asp Thr Lys Pro Leu Ala Tyr Ala Ala 
1 5 XO 


890 



AAT AAT GGA TAC CAA CAA AAG GGT TCT TCT TTT GAT TTT GAT CCT AGC 
Asn Asn Gly Tyr Gin Gin Lys Gly Ser Ser Phe Asp Phe Asp Pro Ser 
15 20 25 

GCT CCT CCA CCG TTT AAG ATT GCA GAA ATC AGA GCT TCA ATA CCA AAA 
Ala Pro Pro Pro Phe Lys He Ala Glu He Arg Ala Ser He Pro J^a 
30 35 « 

CAT TGC TGG GTC AAG AAT CCA TGG AGA TCC CTC AGT TAT GTT CTC AGG 
His Cys Trp Val Lys Asn Pro Trp Arg Ser Leu Ser Tyr Val Leu Arg 
45 50 55 60 

GAT GTG CTT GTA ATT GCT GCA TTG GTS GCT GCA GCA ATT CAC TTC GAC 
ASP Val Leu val He Ala Ala Leu Val Ala Ala Ala He His Phe Asp 
55 ' 70 "5 

AAC TGG CTT CTC TGG CTA ATC TAT TGC CCC ATT CAA GGC ACA ATG TTC 
Asn Trp Leu Leu Trp Leu He Tyr Cys Pro He Gin Gly Thr Met Phe 
80 85 90 



938 



986 



1034 



1082 



1130 
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TGG 6CT CTC TTT GTT CTT GGA CAT GAT TGT GGC CAT GGA AGC TTT TCA 1178 
Trp Ala Xieu Phe Val Leu Gly His Asp Cys Gly His Gly Ser Phe Ser 
95 100 105 

GAT AGC CCT TTG CTG AAT AGC CTG GTG GGA CAC ATC TTG CAT TCC TCA 1226 
Asp Ser Pro Leu Leu Asn Ser Leu Val Gly His He Leu His Ser Ser 
110 115 120 

ATT CTT GTG CCA TAC CAT GGA TGG AGA ATT AGC CAC AGA ACT CAC CAT 1274 
He Leu Val Pro Tyr His Gly Trp Arg He Ser His Arg Thr His His 
125 130 135 140 

CAA AAC CAT GGA CAC ATT GAG AAG GAT GAG TCA TGG GTT CCA TTA ACA 1322 
Gin Asn His Gly His He Glu Lys Asp Glu Ser Trp Val Pro Leu Thr 
145 150 155 

GAG AAG ATT TAC AAG AAT CTA GAC AGC ATG ACA AGA CTC ATT AGA TTC 1370 
Glu Lys He Tyr Lys Asn Leu Asp Ser Met Thr Arg Leu He Arg Phe 
160 165 170 

ACT GTG CCA TTT CCA TTG TTT GTG TAT CCA ATT TAT TTG TTT TCA AGA 1418 
Thr Val Pro Phe Pro Leu Phe Val Tyr Pro He Tyr Leu Phe Ser Arg 
175 180 185 

AGC CCC GGA AAG GAA GGC TCT CAC TTC AAT CCC TAC AGC AAT CTG TTC 1466 
Ser Pro Gly Lys Glu Gly Ser His Phe Asn Pro Tyr Ser Asn Leu Phe 
190 195 200 

CCA CCC AGT GAG AGA AAA GGA ATA GCA ATA TCA ACA CTG TGT TGG GCT 1514 
Pro Pro Ser Glu Arg Lys Gly He Ala He Ser Thr I^eu Cys Trp Ala 
205 210 215 220 

ACC ATG TTT TCT CTG CTT ATC TAT CTC TCA TTC ATA ACT AGT CCA CTT 1562 
Thr Met Phe Ser Leu Leu He Tyr Leu Ser Phe He Thr Ser Pro Leu 
225 230 235 

CTA GTG CTC AAG CTC TAT GGA ATT CCA TAT TGG ATA TTT GTT ATG TGG 1610 
Leu Val Leu Lys Leu Tyr Gly He Pro Tyr Trp He Phe Val Met Trp 
240 245 250 

CTG GAC TTT GTC ACA TAC TTG CAT CAC CAT GGT CAC CAC CAG AAA CTG 1658 
Leu Asp Phe Val Thr Tyr Leu His His His Gly His His Gin Lys Leu 
255 260 265 

CCT TGG TAC C6C GGC AAG GAA TGG AGT TAT TTA AGA GGT GGC CTC ACC 1706 
Pro Trp Tyr Arg Gly Lys Glu Trp Ser Tyr Leu Arg Gly Gly Leu Thr 
270 275 280 

ACT GTG GAT CGT GAC TAT GGT TGG ATC TAT AAC ATT CAC CAT GAC ATT 1754 
Thr Val Asp Arg Asp Tyr Gly Trp He Tyr Asn He His His Asp He 
285 290 295 300 

GGC ACC CAT GTT ATC CAC CAT CTT TTC CCC CAA ATT CCT CAT TAT CAC 1802
Gly Thr His Val Ile His His Leu Phe Pro Gln Ile Pro His Tyr His
305 310 315

CTC GTT GAA GCG ACA CAA GCA GCA AAA CCA GTT CTT GGA GAT TAC TAC 1850
Leu Val Glu Ala Thr Gln Ala Ala Lys Pro Val Leu Gly Asp Tyr Tyr
320 325 330

CGT GAG CCA GAA AGA TCT GCG CCA TTA CCA TTT CAT CTA ATA AAG TAT 1898
Arg Glu Pro Glu Arg Ser Ala Pro Leu Pro Phe His Leu Ile Lys Tyr
335 340 345

TTA ATT CAG AGT ATG AGA CAA GAC CAC TTC GTA AGT GAC ACT GGA GAT 1946
Leu Ile Gln Ser Met Arg Gln Asp His Phe Val Ser Asp Thr Gly Asp
350 355 360

GTT GTT TAT TAT CAG ACT GAT TCT CTG CTC CTC CAC TCG CAA CGA GAC 1994
Val Val Tyr Tyr Gln Thr Asp Ser Leu Leu Leu His Ser Gln Arg Asp
365 370 375 380

TGAGTTTCAA ACTTTTTGGG TTATTATTTA TTGGATTCTA GCTACTCAAA TTACTTTTTT 2054

TTTAATGTTA TGTTTTTTGG AGTTTAACGT TTTCTGAACA ACTTGCAAAT TACTTGCATA 2114

GAGAGACATG GAATATTTAT TTGAAATTAG TAAGGTAGTA ATAATAAATT TTGAATTGTC 2174

AGTTTCA 2181
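The pairing of each codon row with the residue row beneath it can be checked mechanically. The following is a minimal sketch in Python, assuming the standard genetic code; the function names `translate` and `three_letter` are illustrative and do not come from the application. It translates one codon row of the listing (the row ending at nucleotide 1754) and prints the three-letter residues for comparison with the printed protein line.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "Stop",
}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces allowed) into one-letter residues."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def three_letter(protein: str) -> str:
    """Render one-letter residues in the three-letter style of the listing."""
    return " ".join(ONE_TO_THREE[aa] for aa in protein)


# Codon row copied from the listing (ends at nucleotide 1754).
row = "ACT GTG GAT CGT GAC TAT GGT TGG ATC TAT AAC ATT CAC CAT GAC ATT"
print(three_letter(translate(row)))
```

A row whose translation disagrees with the printed residues points to a misread base or residue in that row.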



(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 380 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Met Val Lys Asp Thr Lys Pro Leu Ala Tyr Ala Ala Asn Asn Gly Tyr
1 5 10 15

Gln Gln Lys Gly Ser Ser Phe Asp Phe Asp Pro Ser Ala Pro Pro Pro
20 25 30

Phe Lys Ile Ala Glu Ile Arg Ala Ser Ile Pro Lys His Cys Trp Val
35 40 45

Lys Asn Pro Trp Arg Ser Leu Ser Tyr Val Leu Arg Asp Val Leu Val
50 55 60

Ile Ala Ala Leu Val Ala Ala Ala Ile His Phe Asp Asn Trp Leu Leu
65 70 75 80

Trp Leu Ile Tyr Cys Pro Ile Gln Gly Thr Met Phe Trp Ala Leu Phe
85 90 95

Val Leu Gly His Asp Cys Gly His Gly Ser Phe Ser Asp Ser Pro Leu
100 105 110

Leu Asn Ser Leu Val Gly His Ile Leu His Ser Ser Ile Leu Val Pro
115 120 125

Tyr His Gly Trp Arg Ile Ser His Arg Thr His His Gln Asn His Gly
130 135 140

His Ile Glu Lys Asp Glu Ser Trp Val Pro Leu Thr Glu Lys Ile Tyr
145 150 155 160

Lys Asn Leu Asp Ser Met Thr Arg Leu Ile Arg Phe Thr Val Pro Phe
165 170 175

Pro Leu Phe Val Tyr Pro Ile Tyr Leu Phe Ser Arg Ser Pro Gly Lys
180 185 190

Glu Gly Ser His Phe Asn Pro Tyr Ser Asn Leu Phe Pro Pro Ser Glu
195 200 205

Arg Lys Gly Ile Ala Ile Ser Thr Leu Cys Trp Ala Thr Met Phe Ser
210 215 220

Leu Leu Ile Tyr Leu Ser Phe Ile Thr Ser Pro Leu Leu Val Leu Lys
225 230 235 240

Leu Tyr Gly Ile Pro Tyr Trp Ile Phe Val Met Trp Leu Asp Phe Val
245 250 255

Thr Tyr Leu His His His Gly His His Gln Lys Leu Pro Trp Tyr Arg
260 265 270

Gly Lys Glu Trp Ser Tyr Leu Arg Gly Gly Leu Thr Thr Val Asp Arg
275 280 285

Asp Tyr Gly Trp Ile Tyr Asn Ile His His Asp Ile Gly Thr His Val
290 295 300

Ile His His Leu Phe Pro Gln Ile Pro His Tyr His Leu Val Glu Ala
305 310 315 320

Thr Gln Ala Ala Lys Pro Val Leu Gly Asp Tyr Tyr Arg Glu Pro Glu
325 330 335

Arg Ser Ala Pro Leu Pro Phe His Leu Ile Lys Tyr Leu Ile Gln Ser
340 345 350

Met Arg Gln Asp His Phe Val Ser Asp Thr Gly Asp Val Val Tyr Tyr
355 360 365

Gln Thr Asp Ser Leu Leu Leu His Ser Gln Arg Asp
370 375 380

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1675 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Glycine max
(vii) IMMEDIATE SOURCE:

(B) CLONE: pSFD-118bwp
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 169..1530

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
CTGTGGCAAT TTTTCTCTTC TCCTTCTGGT TCTCATCTTT GTGTTCTTCT TTGTTTCTCA 60
CCTTTCTGAG GATTTTTCCA TCTTAGTTCC TGCSIGGCACC AGGAACCTGA CCAAATAAAT 120
AAACCTTTTT TTTCTTCTAA TTTTTCTGAA GTTTCATTTT «A<5TCCA ATG GCA ACT 177
Met Ala Thr
1

TGG TAT CAT CAG AAA TGT GGC TTG AAG CCT CTT GCT CCA GTA ATT CCT 225
Trp Tyr His Gln Lys Cys Gly Leu Lys Pro Leu Ala Pro Val Ile Pro
5 10 15

AGA CCT AGA ACT GGG GCT GCT TTG TCC AGC ACC TCA AGG GTT GAA TTT 273
Arg Pro Arg Thr Gly Ala Ala Leu Ser Ser Thr Ser Arg Val Glu Phe
20 25 30 35

TTG GAC ACA AAC AAG GTA GTG GCA GGT CCT AAG TTT CAA CCT TTG AGG 321
Leu Asp Thr Asn Lys Val Val Ala Gly Pro Lys Phe Gln Pro Leu Arg
40 45 50

TGC AAC CTC AGG GAG AGG AAT TGG GGG CTG AAA GTG AGT GCC CCT TTG 369
Cys Asn Leu Arg Glu Arg Asn Trp Gly Leu Lys Val Ser Ala Pro Leu
55 60 65

AGG GTT GCT TCC ATT GAA GAG GAG CAA AAG AGT GTT GAT TTA ACC AAT 417
Arg Val Ala Ser Ile Glu Glu Glu Gln Lys Ser Val Asp Leu Thr Asn
70 75 80

GGG ACT AAT GGG GTT GAG CAT GAG AAG CTT CCA GAA TTT GAC CCT GGT 465
Gly Thr Asn Gly Val Glu His Glu Lys Leu Pro Glu Phe Asp Pro Gly
85 90 95

«r-.. f-rr rea CCA TTC AAC TTG GCT GAT ATT AGA GCA GCC ATT CCA AAG 513
Ala Pro Pro Pro Phe Asn Leu Ala Asp Ile Arg Ala Ala Ile Pro Lys
100 105 110 115

CAT TGC TGG GTG AAG GAC CCT TGG AGG TCC ATG AGC TAT GTG GTG AGG 561
His Cys Trp Val Lys Asp Pro Trp Arg Ser Met Ser Tyr Val Val Arg
120 125 130

GAT GTG ATT GCT GTC TTT GGT TTG GCT GCT GCT GCT GCG TAT CTC AAT 609
Asp Val Ile Ala Val Phe Gly Leu Ala Ala Ala Ala Ala Tyr Leu Asn
135 140 145

AAT TGG TTG GTT TGG CCT CTC TAT TGG GCT GCT CAA GGC ACT ATG TTC 657
Asn Trp Leu Val Trp Pro Leu Tyr Trp Ala Ala Gln Gly Thr Met Phe
150 155 160

TGG GCT CTG TTT GTT CTT GGT CAT GAT TGT GGT CAT GGA AGC TTT TCA 705
Trp Ala Leu Phe Val Leu Gly His Asp Cys Gly His Gly Ser Phe Ser
165 170 175

AAC AAC TCC AAA TTG AAC AGT GTT GTT GGA CAT CTG CTG CAT TCT TCA 753
Asn Asn Ser Lys Leu Asn Ser Val Val Gly His Leu Leu His Ser Ser
180 185 190 195

ATT CTA GTG CCA TAT CAT GGA TGG AGA ATC AGT CAT AGG ACT CAT CAC 801
Ile Leu Val Pro Tyr His Gly Trp Arg Ile Ser His Arg Thr His His
200 205 210

CAA CAT CAT GGT CAT GCT GAA AAT GAT GAA TCA TGG CAT CCG TTG CCT 849
Gln His His Gly His Ala Glu Asn Asp Glu Ser Trp His Pro Leu Pro
215 220 225

GAA AAA TTG TTC AGA AGC TTG GAC ACT GTA ACT CGT ATG TTA AGA TTC 897
Glu Lys Leu Phe Arg Ser Leu Asp Thr Val Thr Arg Met Leu Arg Phe
230 235 240

ACA GCA CCT TTT CCA CTT CTT GCA TTT CCT GTG TAC CTT TTT AGT AGG 945
Thr Ala Pro Phe Pro Leu Leu Ala Phe Pro Val Tyr Leu Phe Ser Arg
245 250 255

AGT CCT GGG AAG ACT GGT TCT CAC TTT GAC CCC AGC AGT GAC TTG TTC 993
Ser Pro Gly Lys Thr Gly Ser His Phe Asp Pro Ser Ser Asp Leu Phe
260 265 270 275

GTT CCC AAT GAA AGA AAA GAT GTT ATT ACT TCC ACA GCT TGT TGG GCT 1041
Val Pro Asn Glu Arg Lys Asp Val Ile Thr Ser Thr Ala Cys Trp Ala
280 285 290

GCT ATG TTG GGA TTG CTT GTT GGA TTG GGG TTT GTA ATG GGT CCA ATT 1089
Ala Met Leu Gly Leu Leu Val Gly Leu Gly Phe Val Met Gly Pro Ile
295 300 305

CAA CTT CTT AAG CTT TAT GGT GTT CCC TAT GTT ATA TTC GTT ATG TGG 1137
Gln Leu Leu Lys Leu Tyr Gly Val Pro Tyr Val Ile Phe Val Met Trp
310 315 320

TTG GAT TTG GTG ACT TAT TTG CAC CAT CAT GGC CAT GAA GAC AAA TTA 1185
Leu Asp Leu Val Thr Tyr Leu His His His Gly His Glu Asp Lys Leu
325 330 335

CCT TGG TAC CGT GGA AAS GAA TGG AGC TAC CTC AGG GGT GGT CTA ACT 1233
Pro Trp Tyr Arg Gly Lys Glu Trp Ser Tyr Leu Arg Gly Gly Leu Thr
340 345 350 355

ACT CTT GAT CGT GAT TAT GGA TGG ATC AAT AAC ATT CAC CAT GAC ATT 1281
Thr Leu Asp Arg Asp Tyr Gly Trp Ile Asn Asn Ile His His Asp Ile
360 365 370

GGC ACT CAT GTC ATT CAT CAC CTA TTT CCT CAA ATT CCA CAC TAT CAC 1329
Gly Thr His Val Ile His His Leu Phe Pro Gln Ile Pro His Tyr His
375 380 385

TTA GTT GAG GCT ACT GAG GCT GCT AAG CCA GTG TTT GGA AAA TAT TAT 1377
Leu Val Glu Ala Thr Glu Ala Ala Lys Pro Val Phe Gly Lys Tyr Tyr
390 395 400

AGA GAA CCA AAG AAA TCA GCA GCA CCT CTT CCT TTT CAC CTT ATT GGG 1425
Arg Glu Pro Lys Lys Ser Ala Ala Pro Leu Pro Phe His Leu Ile Gly
405 410 415

GAA ATA ATA AGG AGC TTC AAG ACT GAC CAT TTT GTT AGT GAC ACG GGG 1473
Glu Ile Ile Arg Ser Phe Lys Thr Asp His Phe Val Ser Asp Thr Gly
420 425 430 435

GAT GTT GTG TAC TAT CAA AGO GAC TCT AAG ATT AAT GGC TCT TCC AAA 1521
Asp Val Val Tyr Tyr Gln Thr Asp Ser Lys Ile Asn Gly Ser Ser Lys
440 445 450

TTA GAG TGAATATTAA AATTCTTTTC TATATAGACA AGIMSAGGCTT ATACACAATT 1577
Leu Glu

CTTATTGCTT TAAASATTGT CTTGikGTTTC TCCGAAAGTT ACTGCACTTA CTTGGAGTTG 1637
AATCCTTCAT ZAATAAAGGG ATGGATGGAT CATAXAAA 1675

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 453 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Met Ala Thr Trp Tyr His Gln Lys Cys Gly Leu Lys Pro Leu Ala Pro
1 5 10 15

Val Ile Pro Arg Pro Arg Thr Gly Ala Ala Leu Ser Ser Thr Ser Arg
20 25 30

Val Glu Phe Leu Asp Thr Asn Lys Val Val Ala Gly Pro Lys Phe Gln
35 40 45

Pro Leu Arg Cys Asn Leu Arg Glu Arg Asn Trp Gly Leu Lys Val Ser
50 55 60

Ala Pro Leu Arg Val Ala Ser Ile Glu Glu Glu Gln Lys Ser Val Asp
65 70 75 80

Leu Thr Asn Gly Thr Asn Gly Val Glu His Glu Lys Leu Pro Glu Phe
85 90 95

Asp Pro Gly Ala Pro Pro Pro Phe Asn Leu Ala Asp Ile Arg Ala Ala
100 105 110

Ile Pro Lys His Cys Trp Val Lys Asp Pro Trp Arg Ser Met Ser Tyr
115 120 125

Val Val Arg Asp Val Ile Ala Val Phe Gly Leu Ala Ala Ala Ala Ala
130 135 140

Tyr Leu Asn Asn Trp Leu Val Trp Pro Leu Tyr Trp Ala Ala Gln Gly
145 150 155 160

Thr Met Phe Trp Ala Leu Phe Val Leu Gly His Asp Cys Gly His Gly
165 170 175

Ser Phe Ser Asn Asn Ser Lys Leu Asn Ser Val Val Gly His Leu Leu
180 185 190

His Ser Ser Ile Leu Val Pro Tyr His Gly Trp Arg Ile Ser His Arg
195 200 205

Thr His His Gln His His Gly His Ala Glu Asn Asp Glu Ser Trp His
210 215 220

Pro Leu Pro Glu Lys Leu Phe Arg Ser Leu Asp Thr Val Thr Arg Met
225 230 235 240

Leu Arg Phe Thr Ala Pro Phe Pro Leu Leu Ala Phe Pro Val Tyr Leu
245 250 255

Phe Ser Arg Ser Pro Gly Lys Thr Gly Ser His Phe Asp Pro Ser Ser
260 265 270

Asp Leu Phe Val Pro Asn Glu Arg Lys Asp Val Ile Thr Ser Thr Ala
275 280 285

Cys Trp Ala Ala Met Leu Gly Leu Leu Val Gly Leu Gly Phe Val Met
290 295 300

Gly Pro Ile Gln Leu Leu Lys Leu Tyr Gly Val Pro Tyr Val Ile Phe
305 310 315 320

Val Met Trp Leu Asp Leu Val Thr Tyr Leu His His His Gly His Glu
325 330 335

Asp Lys Leu Pro Trp Tyr Arg Gly Lys Glu Trp Ser Tyr Leu Arg Gly
340 345 350

Gly Leu Thr Thr Leu Asp Arg Asp Tyr Gly Trp Ile Asn Asn Ile His
355 360 365

His Asp Ile Gly Thr His Val Ile His His Leu Phe Pro Gln Ile Pro
370 375 380

His Tyr His Leu Val Glu Ala Thr Glu Ala Ala Lys Pro Val Phe Gly
385 390 395 400

Lys Tyr Tyr Arg Glu Pro Lys Lys Ser Ala Ala Pro Leu Pro Phe His
405 410 415

Leu Ile Gly Glu Ile Ile Arg Ser Phe Lys Thr Asp His Phe Val Ser
420 425 430

Asp Thr Gly Asp Val Val Tyr Tyr Gln Thr Asp Ser Lys Ile Asn Gly
435 440 445

Ser Ser Lys Leu Glu
450



(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 396 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Zea mays
(vii) IMMEDIATE SOURCE:

(B) CLONE: pPCR20
(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 31..363

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
GGATCCACGC ATCATCAGAA TCACGGTCAC ATCCACAGGG ACGAGTCATG GCACCCGATC 60
ACGGAGAAGC TGTACCGGCA ACTAGAGCCA CGCACCAAGA AGCTGAGATT CACGGTGCCC 120
TTCCCCCTGG TCGCATTCCC CGTCTACCTC TTGTACAGGA GCCCCGGCAA GCTCGGCTCC 180
CACTTCCTTC CCAGCAGCGA CCTGTTCAGC CCCAAGGAGA AGAGCGACGT CATGGTGTCA 240
ACCACCTGCT GGTGCATCAT GCTCGCCTCC CTCCTCGCCA TGGCGTGCGC GTTCGGCCCA 300
CTCCAGGTGC TCAAGATGTA CGGCATCCCA TACCTGGTGT TCGTGATGTG GCTTGACCTG 360
GTGACGTACT TACATCACCA CGGCCACGAT GGATCC 396



(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 126 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: YES

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Zea mays
(vii) IMMEDIATE SOURCE:

(B) CLONE: pPCR20

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

His His Gln Asn His Gly His Ile His Arg Asp Glu Ser Trp His Pro
1 5 10 15

Ile Thr Glu Lys Leu Tyr Arg Gln Leu Glu Pro Arg Thr Lys Lys Leu
20 25 30

Arg Phe Thr Val Pro Phe Pro Leu Leu Ala Phe Pro Val Tyr Leu Leu
35 40 45

Tyr Arg Ser Pro Gly Lys Leu Gly Ser His Phe Leu Pro Ser Ser Asp
50 55 60

Leu Phe Ser Pro Lys Glu Lys Ser Asp Val Met Val Ser Thr Thr Cys
65 70 75 80

Trp Cys Ile Met Leu Ala Ser Leu Leu Ala Met Ala Cys Ala Phe Gly
85 90 95

Pro Leu Gln Val Leu Lys Met Tyr Gly Ile Pro Tyr Leu Val Phe Val
100 105 110

Met Trp Leu Asp Leu Val Thr Tyr Leu His His His Gly His
115 120 125



WO 93/11245 PCT/US92/10284

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 472 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Arabidopsis thaliana
(vii) IMMEDIATE SOURCE:

(B) CLONE: pFadx-2 and pYacp7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

CCTCGAGCTA CGTCAGC5GCT AAAACCAGGA ACTGGGCATT GAATGTGGCA ACACCTTTAA 60
CAACTCXTCA GTCTCCATCC GAGGAAGACA GGGAGAGATT CGACCCAGGT GCGCCTCCTC 120
CCTTCAATTT GGCGGATATA AGAGCAGCCA TACCTAAGCA TTGTTGGGTT AAGAATCCAT 180
GGATGTCTAT GAGTTATGTT GTCAGAGATG TTGCTATCGT CTTTGGATTG GCTGCTGTTG 240
CTGCTTACTT CAACAATTGG CTTCTCTGGC CTCTCTACTG GTTCGCTCAA GGAACCATGT 300
TCTGGGCTCT CTTTGTCCTT GGCCATGACT GCGGACATGG TAGCTTCTCG AATGATCCGA 360
GGCTGAACAG TGTGGCTGGT CATCTTCTTC ATTCCTCAAT CCTGGTCCCT TACCATGGCT 420
GGAGGATTAG CCACAGAACT CACCACCAGA ACCATGGTCA TGTCGAGAAT GA 472

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 156 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: YES
(v) FRAGMENT TYPE: N-terminal
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Arabidopsis thaliana
(vii) IMMEDIATE SOURCE:

(B) CLONE: pFadx-2 and pYacpT

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

Ser Ser Tyr Val Arg Ala Lys Thr Arg Asn Trp Ala Leu Asn Val Ala
1 5 10 15

Thr Pro Leu Thr Thr Leu Gln Ser Pro Ser Glu Glu Asp Arg Glu Arg
20 25 30

Phe Asp Pro Gly Ala Pro Pro Pro Phe Asn Leu Ala Asp Ile Arg Ala
35 40 45

Ala Ile Pro Lys His Cys Trp Val Lys Asn Pro Trp Met Ser Met Ser
50 55 60

Tyr Val Val Arg Asp Val Ala Ile Val Phe Gly Leu Ala Ala Val Ala
65 70 75 80

Ala Tyr Phe Asn Asn Trp Leu Leu Trp Pro Leu Tyr Trp Phe Ala Gln
85 90 95

Gly Thr Met Phe Trp Ala Leu Phe Val Leu Gly His Asp Cys Gly His
100 105 110

Gly Ser Phe Ser Asn Asp Pro Arg Leu Asn Ser Val Ala Gly His Leu
115 120 125

Leu His Ser Ser Ile Leu Val Pro Tyr His Gly Trp Arg Ile Ser His
130 135 140

Arg Thr His His Gln Asn His Gly His Val Glu Asn
145 150 155



(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: misc feature

(B) LOCATION: 1..11

(D) OTHER INFORMATION: /note= "N= INOSINE"

(ix) FEATURE:

(A) NAME/KEY: misc feature

(B) LOCATION: 12..31

(D) OTHER INFORMATION: /note= "N= A OR T OR G OR C"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
CGGGATCCAC NCAYCAYCAR AAYCAYGGNC A

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: misc feature

(B) LOCATION: 1..15

(D) OTHER INFORMATION: /note= "N= INOSINE"

(ix) FEATURE:

(A) NAME/KEY: misc feature

(B) LOCATION: 16..35

(D) OTHER INFORMATION: /note= "N= A OR T OR G OR C"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

CGGGATCCRT CRTGNCCRTG RT<»T<a«ARR TANGT 

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: misc feature
(B) LOCATION: 1..36

(D) OTHER INFORMATION: /note= "N= INOSINE"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
TTCGTNNTNG GNCAYGAYTG YGGNCAYGGN CAYGGNAGNT TC

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: misc feature

(B) LOCATION: 1..36

(D) OTHER INFORMATION: /note= "N= INOSINE"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
TTCGTNNTNG GNCAYGAYTG YGGNCAYGGN TCNTTC 36



(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
GGHCAYGAYT GYGGHCAC
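The oligonucleotides in this part of the listing are written with IUPAC ambiguity codes (R, Y, H, D, N), so each entry stands for a pool of concrete sequences. The sketch below, in Python, counts and enumerates that pool for SEQ ID NO: 22. The helper names `degeneracy` and `expand` are illustrative only. The code treats N as any of the four bases; where the listing annotates N as inosine, the enumeration is only an approximation of pairing behaviour.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def _clean(primer: str) -> str:
    """Drop the spacing used in the listing and normalise case."""
    return "".join(primer.split()).upper()


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in _clean(primer):
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list:
    """Enumerate every concrete sequence in the degenerate pool."""
    choices = (IUPAC[base] for base in _clean(primer))
    return ["".join(combo) for combo in product(*choices)]


seq_id_22 = "GGHCAYGAYT GYGGHCAC"
print(degeneracy(seq_id_22))   # two H positions and three Y positions: 3*3*2*2*2
print(expand(seq_id_22)[:3])   # first few members of the pool
```

Enumeration is practical only for low-degeneracy primers; for the 36- to 42-base entries with many N positions, `degeneracy` alone is the useful call.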



(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

GGHCAYGAYT GYGGHCAT

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

GTACTRTARC CDTGDGTR 18

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
GTGCTRTARC CDTGDGTR

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
GTRCANTARG TRGTRAAYAA YGG

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
GTRCANTADG TRGTRGADAA YGG



(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: misc feature
(B) LOCATION: 1..36

(D) OTHER INFORMATION: /note= "N= INOSINE"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
TTCGTNNTNG GNCAYGAYTG YGGNCAYGGN AGNTTT

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: misc feature

(B) LOCATION: 1..36

(D) OTHER INFORMATION: /note= "N= INOSINE"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
TTCGTNNTNG GNCAYGAYTG YGGNCAYGGN TCNTTT

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: misc feature 

(B) LOCATION: 1..38 

(D) OTHER INFORMATION: /note= "N= INOSINE" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 
GTRCTRTANC CNTGNGTNCA NTANGTAGTG RANAAGGG 

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 38 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: misc feature

(B) LOCATION: 1..38
(D) OTHER INFORMATION: /note= "N= INOSINE"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
GTRCTRTANC CNTGNGTNCA NTANGTGGTG RANAAGGG

(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 138 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: misc feature

(B) LOCATION: 1..135

(D) OTHER INFORMATION: /note= "N= INOSINE"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
GTGGTGN6TN CNGTGNGANA NNCKCCANCC GTGGTANGGN ACNANNANGA ANGANGAdTG 60 
NANNANGTGN CCNACNANNG AGTTNANNAN NGGNATNTCN GAGAANGA^ "0 

138 

NTCGTGNCCN ANNACGAA

1. An isolated nucleic acid fragment comprising a
nucleic acid sequence encoding a fatty acid desaturase
or a fatty acid desaturase-related enzyme with an amino
acid identity of 50% or greater to the polypeptide
encoded by SEQ ID NOS:1, 4, 6, 8, 10, 12, 14 or 16.

2. The isolated nucleic acid fragment of Claim 1
wherein the amino acid identity is 65% or greater to the
polypeptide encoded by SEQ ID NOS:1, 4, 6, 8, 10, 12, 14
or 16.

3. The isolated nucleic acid fragment of Claim 1
wherein the nucleic acid identity is 90% or greater to
SEQ ID NOS:1, 4, 6, 8, 10, 12, 14 or 16.
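The identity thresholds recited above (50%, 65% and 90%) are percentages of matching positions between two aligned sequences. The claims do not say which alignment procedure or denominator is intended, so the following Python sketch is only one common convention, stated here as an assumption: identical columns divided by all aligned columns, skipping columns where both sequences carry a gap. The function name `percent_identity` is hypothetical. The two ungapped 28-residue fragments are taken from SEQ ID NO: 11 (residues 93-120) and SEQ ID NO: 13 (residues 163-190) of the listing.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    '-' marks a gap. Columns where both sequences have a gap are ignored;
    a gap opposite a residue counts as a mismatch (an assumed convention).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


# One-letter transcriptions of fragments from the sequence listing.
castor_frag = "FWALFVLGHDCGHGSFSDSPLLNSLVGH"   # SEQ ID NO: 11, residues 93-120
soybean_frag = "FWALFVLGHDCGHGSFSNNSKLNSVVGH"  # SEQ ID NO: 13, residues 163-190

identity = percent_identity(castor_frag, soybean_frag)
for threshold in (50, 65, 90):
    print(threshold, identity >= threshold)
```

A short, highly conserved window like this one overstates identity relative to a full-length comparison, so it illustrates the arithmetic rather than the outcome for whole sequences.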

4. An isolated nucleic acid fragment of Claim 1
wherein said fragment is isolated from a plant selected
from the group consisting of soybean, oilseed Brassica
species, Arabidopsis thaliana and corn.

5. A chimeric gene capable of causing altered
levels of linolenic acid in a transformed plant cell,
the gene comprising a nucleic acid fragment of any of
Claims 1, 2, or 3, the fragment operably linked to
suitable regulatory sequences.

6. Plants containing the chimeric genes of
Claim 5.

7. Oil obtained from seeds of the plants
containing the chimeric genes of Claim 5.

8. A method of producing seed oil containing
altered levels of linolenic (18:3) acid comprising:

(a) transforming a plant cell of an oil-producing
species with a chimeric gene of Claim 5;

(b) growing fertile plants from the
transformed plant cells of step (a);

(c) screening progeny seeds from the fertile
plants of step (b) for the desired levels of linolenic
(18:3) acid; and

(d) processing the progeny seed of step (c)
to obtain seed oil containing altered levels of
linolenic (18:3) acid.

9. The product of the method of Claim 8.

10. A method of Claim 8 wherein said plant cell of
an oil-producing species is selected from the group
consisting of Arabidopsis thaliana, soybean, oilseed
Brassica species, sunflower, cotton, cocoa, peanut,
safflower, and corn.

11. A method of breeding plant species producing
altered levels of linolenic acid in the seed oil of
oil-producing plant species comprising:

(a) making a cross between two varieties of
oil-producing species differing in the linolenic acid
trait;

(b) making a Southern blot of restriction
enzyme digested genomic DNA isolated from several
progeny plants resulting from the cross of step (a); and

(c) hybridizing the Southern blot with a
radiolabelled nucleic acid fragment of Claim 1.

12. The product of the method of Claim 11.

13. A method of RFLP mapping in a genomic RFLP
marker comprising:

(a) making a cross between two varieties of
plants;

(b) making a Southern blot of restriction
enzyme digested genomic DNA isolated from several
progeny plants resulting from the cross of step (a); and

(c) hybridizing the Southern blot with a
radiolabelled nucleic acid fragment of Claim 1.

14. A method to isolate nucleic acid fragments
encoding fatty acid desaturases and fatty acid
desaturase-related enzymes, comprising:

(a) comparing SEQ ID NOS:2, 5, 7, 9, 11, 13,
15 and 17 with other fatty acid desaturase polypeptide
sequences;

(b) identifying the conserved sequence(s) of
4 or more amino acids obtained in step (a);

(c) making region-specific nucleotide
probe(s) or oligomer(s) based on the conserved sequences
identified in step (b); and

(d) using the nucleotide probe(s) or
oligomer(s) of step (c) to isolate sequences encoding
fatty acid desaturases and fatty acid desaturase-related
enzymes by sequence-dependent protocols.
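Steps (a) and (b) of the method above amount to scanning aligned polypeptide sequences for runs of identical residues at least four long. The Python sketch below shows one way this could be done for sequences that are already aligned to equal length; the alignment step itself is not shown, and the function name `conserved_runs` is hypothetical. The three fragments are one-letter transcriptions of corresponding windows of SEQ ID NOS: 11, 13 and 17 from the listing.

```python
def conserved_runs(aligned, min_len=4):
    """Return (start_index, residues) for every run of columns in which all
    aligned sequences carry the same residue, keeping runs >= min_len."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")

    runs = []
    start = None
    # Iterate one past the end so a run reaching the last column is closed.
    for i in range(length + 1):
        identical = (
            i < length
            and aligned[0][i] != "-"
            and len({seq[i] for seq in aligned}) == 1
        )
        if identical and start is None:
            start = i
        elif not identical and start is not None:
            if i - start >= min_len:
                runs.append((start, aligned[0][start:i]))
            start = None
    return runs


fragments = [
    "FWALFVLGHDCGHGSFSDSPLLNSLVGH",  # SEQ ID NO: 11, residues 93-120
    "FWALFVLGHDCGHGSFSNNSKLNSVVGH",  # SEQ ID NO: 13, residues 163-190
    "FWALFVLGHDCGHGSFSNDPRLNSVAGH",  # SEQ ID NO: 17, residues 101-128
]
print(conserved_runs(fragments))
```

The conserved block found this way covers the Gly-His-Asp-Cys-Gly-His stretch that the degenerate oligonucleotides of SEQ ID NOS: 22 and 23 encode, which is the kind of region step (c) would turn into a probe.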

15. The product of the method of Claim 14.

16. The isolated genomic DNA of Arabidopsis
thaliana identified by accession number ATCC 75167.

17. An isolated cDNA clone which encodes for
soybean delta-15 desaturase, the clone designated pXF1
comprising the DNA sequence of SEQ ID NO:10 and
identified by accession number ATCC 68874.

18. An isolated cDNA clone which encodes for
oilseed Brassica species delta-15 desaturase, the clone
designated pBNSF3 comprising the DNA sequence of SEQ ID
NO:6 and identified by accession number ATCC 68854.

19. An isolated Polymerase Chain Reaction Product
for Zea mays delta-15 desaturase, the clone designated
pcr20 comprising the DNA sequence of SEQ ID NO:14.



INTERNATIONAL SEARCH REPORT

International Application No. PCT/US 92/10284

I. CLASSIFICATION OF SUBJECT MATTER

Int. Cl. 5 C12N15/53; C12N15/82; C11B1/00; C12Q1/68

II. FIELDS SEARCHED

Int. Cl. 5 C12N; C11B; C12Q

III. DOCUMENTS CONSIDERED TO BE RELEVANT

UCLA SYMP. MOL. CELL. BIOL., NEW SER.,
PLANT GENE TRANSFER
vol. 129, 1990,
pages 301 - 309
BROWSE, J., ET AL. 'Strategies for
modifying plant lipid composition'
see the whole document

SCIENCE
vol. 252, 5 April 1991, LANCASTER, PA US
pages 80 - 87
SOMERVILLE, C. ET AL. 'Plant lipids:
Metabolism, mutants, and membranes'
see page 82, right column, line 24 - line 27



7,11.14 



2-6,8, 
10,15 

2-6,8, 
10,15 





IV. CERTIFICATION

Date of the Actual Completion of the International Search
17 MARCH 1993

Date of Mailing of this International Search Report
,9. 03.93

International Searching Authority
EUROPEAN PATENT OFFICE

Signature of Authorized Officer
MADDOX A.D.




P.X 



THEOR. APPL. GENET.
vol. 80, no. 2, 1990,
pages 234 - 240
LEMIEUX, B., ET AL. 'Mutants of
Arabidopsis with alterations in seed lipid
fatty acid composition'
see the whole document

vol. 258, 20 November 1992, LANCASTER, PA US
pages 1353 - 1355
ARONDEL, V., ET AL. 'Map-based cloning of
a gene controlling omega-3 fatty acid
desaturation in Arabidopsis'
see the whole document

1986, ROCKVILLE, MD, USA
pages 859 - 864
BROWSE, J., ET AL. 'A mutant of
Arabidopsis deficient in C18:3 and C16:3
leaf lipids'
see the whole document

WO,A,9 113 972 (CALGENE)
19 September 1991
see the whole document



7.9,11 



1-13,15 



lb 



1-12 



l-IO 



ANNEX TO THE INTERNATIONAL SEARCH REPORT
ON INTERNATIONAL PATENT APPLICATION NO. US 9210284

This annex lists the patent family members relating to the patent documents cited in the above-mentioned international search report.
17/03/93

Patent document cited    Publication    Patent family    Publication
in search report         date           member(s)        date

WO-A-9113972             19-09-91       EP-A-0472722     04-03-92



WORLD INTELLECTUAL PROPERTY ORGANIZATION
International Bureau

PCT

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)

(51) International Patent Classification 5:
C12N 15/53, 15/82, C11B 1/00,
C12Q 1/68, A01H 5/00

A1

(11) International Publication Number: WO 94/11516
(43) International Publication Date: 26 May 1994 (26.05.94)

(21) International Application Number: PCT/US93/09987

(22) International Filing Date: 15 October 1993 (15.10.93)

(30) Priority data:
07/977,339 17 November 1992 (17.11.92) US

(60) Parent Application or Grant
(63) Related by Continuation
US 07/977,339 (CIP)
Filed on 17 November 1992 (17.11.92)

(71) Applicant (for all designated States except US): E.I. DU
PONT DE NEMOURS AND COMPANY [US/US];
1007 Market Street, Wilmington, DE 19898 (US).

(72) Inventors; and
(75) Inventors/Applicants (for US only): LIGHTNER, Jonathan,
Edward [US/US]; 438 East Market Street, Marietta, PA
17547 (US). OKULEY, John, Joseph [US/US]; 217 Fallis
Road, Columbus, OH 43214 (US).

(74) Agents: MORRISSEY, Bruce, W. et al.; E.I. du Pont de
Nemours and Company, Legal/Patent Records Center,
1007 Market Street, Wilmington, DE 19898 (US).

(81) Designated States: AU, BR, CA, JP, US, European patent
(AT, BE, CH, DE, DK, ES, FR, GB, GR, IE, IT, LU,
MC, NL, PT, SE).

Published
With international search report.

(54) Title: GENES FOR MICROSOMAL DELTA-12 FATTY ACID DESATURASES AND RELATED ENZYMES FROM
PLANTS

(57) Abstract

The preparation and use of nucleic acid fragments encoding fatty acid desaturase enzymes are described. The invention
permits alteration of plant lipid composition. Chimeric genes incorporating such nucleic acid fragments with suitable regulatory
sequences may be used to create transgenic plants with altered levels of unsaturated fatty acids.





TITLE

GENES FOR MICROSOMAL DELTA-12 FATTY ACID
DESATURASES AND RELATED ENZYMES FROM PLANTS
FIELD OF THE INVENTION

The invention relates to the preparation and use of
nucleic acid fragments encoding fatty acid desaturase
enzymes to modify plant lipid composition. Chimeric
genes incorporating such nucleic acid fragments and
suitable regulatory sequences may be used to create
transgenic plants with altered levels of unsaturated
fatty acids.

BACKGROUND OF THE INVENTION

Plant lipids have a variety of industrial and
nutritional uses and are central to plant membrane
function and climatic adaptation. These lipids
represent a vast array of chemical structures, and these
structures determine the physiological and industrial
properties of the lipid. Many of these structures
result either directly or indirectly from metabolic
processes that alter the degree of unsaturation of the
lipid. Different metabolic regimes in different plants
produce these altered lipids, and either domestication
of exotic plant species or modification of agronomically
adapted species is usually required to economically
produce large amounts of the desired lipid.

Plant lipids find their major use as edible oils in
the form of triacylglycerols. The specific performance
and health attributes of edible oils are determined
largely by their fatty acid composition. Most vegetable
oils derived from commercial plant varieties are
composed primarily of palmitic (16:0), stearic (18:0),
oleic (18:1), linoleic (18:2) and linolenic (18:3)
acids. Palmitic and stearic acids are, respectively,
16- and 18-carbon-long, saturated fatty acids. Oleic,
linoleic, and linolenic acids are 18-carbon-long,
unsaturated fatty acids containing one, two, and three
double bonds, respectively. Oleic acid is referred to
as a mono-unsaturated fatty acid, while linoleic and
linolenic acids are referred to as polyunsaturated
fatty acids. The relative amounts of saturated and
unsaturated fatty acids in commonly used, edible
vegetable oils are summarized below (Table 1):

TABLE 1

Percentages of Saturated and Unsaturated Fatty
Acids in the Oils of Selected Oil Crops

               Saturated   Mono-unsaturated   Poly-unsaturated

Canola             6%            58%                36%
Soybean           15%            24%                61%
Corn              13%            25%                62%
Peanut            18%            48%                34%
Safflower          9%            13%                78%
[illegible]        9%            41%                51%
Cotton            30%            19%                51%

Many recent research efforts have examined the role
that saturated and unsaturated fatty acids play in
reducing the risk of coronary heart disease. In the
past, it was believed that mono-unsaturates, in contrast
to saturates and polyunsaturates, had no effect on
serum cholesterol and coronary heart disease risk.
Several recent human clinical studies suggest that diets
high in mono-unsaturated fat and low in saturated fat
may reduce the "bad" (low-density lipoprotein)
cholesterol while maintaining the "good" (high-density
lipoprotein) cholesterol (Mattson et al., Journal of
Lipid Research (1985) 26:194-202).

A vegetable oil low in total saturates and high in
mono-unsaturates would provide significant health
benefits to consumers as well as economic benefits to
oil processors. As an example, canola oil is considered
a very healthy oil. However, in use, the high level of
poly-unsaturated fatty acids in canola oil renders the
oil unstable, easily oxidized, and susceptible to
development of disagreeable odors and flavors
(Galliard, 1980, Vol. 4, pp. 85-116 In: Stumpf, P. K.,
Ed., The Biochemistry of Plants, Academic Press, New
York). The levels of polyunsaturates may be reduced by
hydrogenation, but the expense of this process and the
concomitant production of nutritionally questionable
trans isomers of the remaining unsaturated fatty acids
reduce the overall desirability of the hydrogenated oil
(Mensink et al., New England J. Medicine (1990) 323:
439-445). Similar problems exist with soybean and corn
oils.

For specialized uses, high levels of poly-unsaturates
can be desirable. Linoleate and linolenate
are essential fatty acids in human diets, and an edible
oil high in these fatty acids can be used for
nutritional supplements, for example in baby foods.

Mutation-breeding programs have met with some
success in altering the levels of poly-unsaturated fatty
acids found in the edible oils of agronomic
species. Examples of commercially grown varieties are
high (85%) oleic sunflower and low (2%) linolenic flax
(Knowles, (1980) pp. 35-38 In: Applewhite, T. H., Ed.,
World Conference on Biotechnology for the Fats and Oils
Industry Proceedings, American Oil Chemists' Society).
Similar commercial progress with the other plants shown
in Table 1 has been largely elusive due to the difficult
nature of the procedure and the pleiotropic effects of
the mutational regime on plant hardiness and yield
potential.

The biosynthesis of the major plant lipids has been
the focus of much research (Browse et al., Ann. Rev.
Plant Physiol. Mol. Biol. (1991) 42:467-506). These
studies show that, with the notable exception of the
soluble stearoyl-acyl carrier protein desaturase, the
controlling steps in the production of unsaturated fatty
acids are largely catalyzed by membrane-associated fatty
acid desaturases. Desaturation reactions occur in
plastids and in the endoplasmic reticulum using a
variety of substrates including galactolipids,
sulfolipids, and phospholipids. Genetic and physiological
analyses of Arabidopsis thaliana nuclear mutants
defective in various fatty acid desaturation reactions
indicate that most of these reactions are catalyzed by
enzymes encoded at single genetic loci in the plant.
The analyses show further that the different defects in
fatty acid desaturation can have profound and different
effects on the ultra-structural morphology, cold
sensitivity, and photosynthetic capacity of the plants
(Ohlrogge, et al., Biochim. Biophys. Acta (1991)
1082:1-26). However, biochemical characterization of
the desaturase reactions has been meager. The
instability of the enzymes and the intractability of
their proper assay have largely limited researchers to
investigations of enzyme activities in crude membrane
preparations. These investigations have, however,
demonstrated the role of delta-12 desaturase and
delta-15 desaturase activities in the production of
linoleate and linolenate from 2-oleoyl-phosphatidylcholine
and 2-linoleoyl-phosphatidylcholine,
respectively (Wang et al., Plant Physiol. Biochem.
(1988) 26:777-792). Thus, modification of the
activities of these enzymes represents an attractive
target for altering the levels of lipid unsaturation by
genetic engineering.

Nucleotide sequences encoding microsomal delta-9
stearoyl-coenzyme-A desaturases from yeast, rat, and
mice have been described (Stukey, et al., J. Biol.
Chem. (1990) 265:20144-20149; Thiede, et al., J. Biol.
Chem. (1986) 261:13230-13235; Kaestner, et al., J. Biol.
Chem. (1989) 264:14755-1476). Nucleotide sequences
encoding soluble delta-9 stearoyl-acyl carrier protein
desaturases from higher plants have also been described
(Thompson, et al., Proc. Natl. Acad. Sci. U.S.A. (1991)
88:2578-2582; Shanklin et al., Proc. Natl. Acad. Sci.
USA (1991) 88:2510-2514). A nucleotide sequence from
coriander plant encoding a soluble fatty acid
desaturase, whose deduced amino acid sequence is highly
identical to that of the stearoyl-acyl carrier protein
desaturase and which is responsible for introducing the
double bond in petroselinic fatty acid (18:1, 6c), has
also been described [Cahoon, et al. (1992) Proc. Natl.
Acad. Sci. U.S.A. 89:11184-11188]. Two fatty acid
desaturase genes from the cyanobacterium, Synechocystis
PCC6803, have been described: one encodes a fatty acid
desaturase, designated des A, that catalyzes the
conversion of oleic acid at the sn-1 position of
galactolipids to linoleic acid [Wada, et al., Nature
(1990) 347:200-203]; another encodes a delta-6 fatty
acid desaturase that catalyzes the conversion of
linoleic acid at the sn-1 position of galactolipids to
γ-linolenic acid (18:3, 6c, 9c, 12c) [WO 9306712]. Nucleotide
sequences encoding higher plant membrane-bound
microsomal and plastid delta-15 fatty acid desaturases
have also been described [WO 9311245; Arondel, V. et
al. (1992) Science 258:1353-1355]. There is no report
of the isolation of higher plant genes encoding fatty
acid desaturases other than the soluble delta-6 and
delta-9 desaturases and the membrane-bound (microsomal
and plastid) delta-15 desaturases. While there is
extensive amino acid sequence identity between the
soluble desaturases and significant amino acid sequence
identity between the higher plant microsomal and plastid
delta-15 desaturases, there is no significant homology
between the soluble and the membrane-bound desaturases.
Sequence-dependent protocols based on the sequences
encoding delta-15 desaturases have been unsuccessful in
cloning sequences for microsomal delta-12 desaturase.
For example, nucleotide sequences of microsomal or
plastid delta-15 desaturases used as hybridization probes
have been unsuccessful in isolating a plant microsomal
delta-12 desaturase clone. Furthermore, while we have
used a set of degenerate oligomers made to a stretch of
12 amino acids, which is identical in all plant delta-15
desaturases and highly conserved (10/12) in the
cyanobacterial des A desaturase, as a hybridization
probe to isolate a higher plant nucleotide sequence
encoding plastid delta-12 fatty acid desaturase, this
method has been unsuccessful in isolating the microsomal
delta-12 desaturase cDNAs. Furthermore, there has been
no success in isolating the microsomal delta-12
desaturase by using the polymerase chain reaction
products derived from plant DNA, plant RNA or plant cDNA
library using PCR primers made to stretches of amino
acids that are conserved between the higher plant
delta-15 and des A desaturases. Thus, there are no
teachings which enable the isolation of plant microsomal
delta-12 fatty acid desaturases or plant fatty acid
desaturase-related enzymes. Furthermore, there is no
evidence for a method to control the level of
delta-12 fatty acid desaturation or hydroxylation in
plants using nucleic acids encoding delta-12 fatty acid
desaturases or hydroxylases.

The biosynthesis of the minor plant lipids has been
less well studied. While hundreds of different fatty
acids have been found, many from the plant kingdom, only
a tiny fraction of all plants have been surveyed for
their lipid content (Gunstone, et al., Eds., (1986) The
Lipids Handbook, Chapman and Hall Ltd., Cambridge).
Accordingly, little is known about the biosynthesis of
these unusual fatty acids and fatty acid derivatives.
Interesting chemical features found in such fatty acids
include, for example, allenic and conjugated double
bonds, acetylenic bonds, trans double bonds, multiple
double bonds, and single double bonds in a wide number
of positions and configurations along the fatty acid
chain. Similarly, many of the structural modifications
found in unusual lipids (e.g., hydroxylation,
epoxidation, cyclization, etc.) are probably produced
via further metabolism following chemical activation of
the fatty acid by desaturation or they involve a
chemical reaction that is mechanistically similar to
desaturation. Many of these fatty acids and derivatives
having such features within their structure could prove
commercially useful if an agronomically viable species
could be induced to synthesize them by introduction of a
gene encoding the appropriate desaturase. Of particular
interest are vegetable oils rich in 12-hydroxyoctadeca-9-enoic
acid (ricinoleic acid). Ricinoleic acid and its
derivatives are widely used in the manufacture of
lubricants, polymers, cosmetics, coatings and
pharmaceuticals (e.g., see Gunstone, et al., Eds.,
(1986) The Lipids Handbook, Chapman and Hall Ltd.,
Cambridge). The only commercial source of ricinoleic
acid is castor oil and 100% of the castor oil used by
the U.S. is derived from beans grown elsewhere in the
world, mainly Brazil. Ricinoleic acid in castor beans
is synthesized by the addition of a hydroxyl group at
the delta-12 position of oleic acid (Galliard & Stumpf
(1966) J. Biol. Chem. 241:5806-5812). This reaction
resembles the initial reaction in a possible mechanism
for the desaturation of oleate at the delta-12 position
to linoleate since dehydration of 12-hydroxyoctadeca-9-enoic
acid, by an enzyme activity analogous to the
hydroxydecanoyl dehydrase of E. coli (Cronan et al.
(1988) J. Biol. Chem. 263:4641-4646), would result in
the formation of linoleic acid. Evidence for the
hydroxylation reaction being part of a general mechanism
of enzyme-catalyzed desaturation in eukaryotes has been
obtained by substituting a sulfur atom in the place of
carbon at the delta-9 position of stearic acid. When
incubated with yeast cell extracts the thiostearate was
converted to a 9-sulfoxide (Buist et al. (1987)
Tetrahedron Letters 28:857-860). This sulfoxidation was
specific for sulfur at the delta-9 position and did not
occur in a yeast delta-9-desaturase deficient mutant
(Buist & Marecak (1991) Tetrahedron Letters 32:891-894).
The 9-sulfoxide is the sulfur analogue of
9-hydroxyoctadecastearate, the proposed intermediate of stearate
desaturation.

Hydroxylation of oleic acid to ricinoleic acid in
castor bean cells, like microsomal desaturation of
oleate in plants, occurs at the delta-12 position of the
fatty acid at the sn-2 position of phosphatidylcholine
in microsomes (Bafor et al. (1991) Plant Physiol
280:507-514). Furthermore, castor oleate delta-12
hydroxylation and plant oleate microsomal delta-12
desaturation are both inhibited by iron chelators and
require molecular oxygen (Moreau & Stumpf (1981) Plant
Physiology 67:672-676; Somerville, C. (1992) MSU-DOE
Plant Research Laboratory Annual Report). These
biochemical similarities, in conjunction with the
observation that antibodies raised against cytochrome b5
completely inhibit the activities of both oleate
delta-12 desaturation in safflower microsomes and oleate
delta-12 hydroxylase in castor microsomes [Somerville,
C. (1992) MSU-DOE Plant Research Laboratory Annual
Report], comprise strong evidence that the hydroxylase
and the desaturase are functionally related. It seems
reasonable to assume, therefore, that the nucleotide
sequence encoding a plant delta-12 desaturase would be
useful in cloning the oleate hydroxylase gene from
castor by sequence-dependent protocols, for example, by
screening a castor DNA library with oligomers based on
amino acid regions conserved between delta-12
desaturases, or regions conserved between delta-12 and
other desaturases, or with oligomers based on amino
acids conserved between delta-12 desaturases and known
membrane-associated hydroxylases. It would be more
efficient to isolate the castor oleate hydroxylase cDNA
by combining the sequence-dependent protocols with a
"differential" library approach. One example of such a
difference library would be based on different stages of
castor seed development, since ricinoleic acid is not
synthesized by very young castor seeds (less than
12 DAP, corresponding to stage I and stage II seeds in
the scheme of Greenwood & Bewley, Can. J. Bot. (1982)
60:1751-1760). In the 20 days following these early
stages, the relative ricinoleate content increases from
0% to almost 90% of total seed fatty acids (James et al.,
Biochem. J. (1965) 95:448-452; Canvin, Can. J. Biochem.
Physiol. (1963) 41:1879-1885). Thus it would be
possible to make a cDNA "difference" library from
mRNA present in a stage when ricinoleic acid was being
synthesized at a high rate but from which mRNA present
in earlier stages was removed. For the earlier stage
mRNA, a stage such as stage II (10 DAP), when ricinoleic
acid is not being made but when other unsaturated fatty
acids are, would be appropriate. The construction of
libraries containing only differentially expressed genes
is well known in the art (Sargent, Meth. Enzymol. (1987)
152:423-432). Assembly of the free ricinoleic acid, via
ricinoleoyl-CoA, into triacylglycerol is readily
catalyzed by canola and safflower seed microsomes (Bafor
et al., Biochem J. (1991) 280:507-514; Wiberg et al.,
10th International Symposium on the Metabolism, Structure
& Function of Plant Lipids (1992), Jerba, Tunisia) and
ricinoleic acid is removed from phosphatidylcholine by a
lipase common to all oilseeds investigated. Thus,
expression of the castor bean oleate hydroxylase gene in
oil crops, such as canola seeds and soybeans, would be
expected to result in an oil rich in triglycerides
containing ricinoleic acid.

SUMMARY OF THE INVENTION

Applicants have discovered a means to control the
nature and levels of unsaturated fatty acids in plants.
Nucleic acid fragments from cDNAs or genes encoding
fatty acid desaturases are used to create chimeric
genes. The chimeric genes may be used to transform
various plants to modify the fatty acid composition of
the plant or the oil produced by the plant. More
specifically, one embodiment of the invention is an
isolated nucleic acid fragment comprising a nucleotide
sequence encoding a fatty acid desaturase or a fatty
acid desaturase-related enzyme with an amino acid
identity of 50%, 60%, 90% or greater respectively to the
polypeptide encoded by SEQ ID NOS:1, 3, 5, 7, 9, 11, or
15. Most specifically, the invention pertains to a gene
sequence for plant microsomal delta-12 fatty acid
desaturase or desaturase-related enzyme. The plant in
this embodiment may more specifically be soybean,
oilseed Brassica species, Arabidopsis thaliana, castor,
and corn.
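
The identity thresholds recited above (50%, 60%, 90%) presuppose some way of scoring two aligned polypeptides. The text does not say how identity is to be computed, so the following is only a minimal sketch under stated assumptions: the two sequences are already aligned to equal length, "-" marks a gap, and identity is the fraction of identical residues over all columns in which at least one sequence has a residue. The function names and the toy peptides are hypothetical and are not taken from the Sequence Descriptions.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: columns where both sequences carry a gap are ignored;
    a residue opposite a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True when the identity is at or above the given percentage."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy peptides (hypothetical, for illustration only): 8 of 10 positions match.
toy_a = "MGAGGRMPVP"
toy_b = "MGAGGRTDVP"
print(percent_identity(toy_a, toy_b))     # 80.0
print(meets_threshold(toy_a, toy_b, 60))  # True
print(meets_threshold(toy_a, toy_b, 90))  # False
```

Other conventions (for example, dividing by the length of the shorter sequence) give different percentages, so the denominator should be stated whenever a threshold of this kind is applied.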

Another embodiment of this invention involves the
use of these nucleic acid fragments in sequence-dependent
protocols. Examples include use of the
fragments as hybridization probes to isolate nucleotide
sequences encoding other fatty acid desaturases or fatty
acid desaturase-related enzymes. A related embodiment
involves using the disclosed sequences for amplification
of RNA or DNA fragments encoding other fatty acid
desaturases or fatty acid desaturase-related enzymes.

Another aspect of this invention involves chimeric
genes capable of modifying the fatty acid composition in
the seed of a transformed plant, the gene comprising
nucleic acid fragments related as defined to SEQ ID
NOS:1, 3, 5, 7, 9, or 15 encoding fatty acid desaturases
or SEQ ID NO:11 encoding a desaturase or
desaturase-related enzyme operably linked in suitable orientation
to suitable regulatory sequences. Preferred are those
chimeric genes which incorporate nucleic acid fragments
encoding microsomal delta-12 fatty acid desaturase or
desaturase-related enzymes.

Yet another embodiment of the invention involves a
method of producing seed oil containing altered levels
of unsaturated fatty acids comprising: (a) transforming
a plant cell with a chimeric gene described above;
(b) growing sexually mature plants from the transformed
plant cells of step (a); (c) screening progeny seeds
from the sexually mature plants of step (b) for the
desired levels of unsaturated fatty acids, and
(d) processing the progeny seed of step (c) to obtain
seed oil containing altered levels of the unsaturated
fatty acids. Preferred plant cells and oils are derived
from soybean, rapeseed, sunflower, cotton, cocoa,
peanut, safflower, coconut, flax, oil palm, and corn.
Preferred methods of transforming such plant cells would
include the use of Ti and Ri plasmids of Agrobacterium,
electroporation, and high-velocity ballistic
bombardment.

The invention also is embodied in a method of RFLP
breeding to obtain altered levels of oleic acids in the
seed oil of oil producing plant species. This method
involves (a) making a cross between two varieties of oil
producing plant species differing in the oleic acid
trait; (b) making a Southern blot of restriction enzyme
digested genomic DNA isolated from several progeny
plants resulting from the cross; and (c) hybridizing the
Southern blot with the radiolabelled nucleic acid
fragments encoding the fatty acid desaturases or
desaturase-related enzymes.

The invention is also embodied in a method of RFLP
mapping that uses the isolated microsomal delta-12
desaturase cDNA or related genomic fragments described
herein.

The invention is also embodied in plants capable of
producing altered levels of fatty acid desaturase by
virtue of containing the chimeric genes described
herein. Further, the invention is embodied by seed oil
obtained from such plants.

BRIEF DESCRIPTION OF THE SEQUENCE DESCRIPTIONS

The invention can be more fully understood from the
following detailed description and the Sequence
Descriptions which form a part of this application. The
Sequence Descriptions contain the three letter codes for
amino acids as defined in 37 C.F.R. 1.822 which are
incorporated herein by reference.

SEQ ID NO:1 shows the 5' to 3' nucleotide sequence
of 1372 base pairs of the Arabidopsis thaliana cDNA
which encodes microsomal delta-12 desaturase.
Nucleotides 93-95 and nucleotides 1242-1244 are,
respectively, the putative initiation codon and the
termination codon of the open reading frame (nucleotides
93-1244). Nucleotides 1-92 and 1245-1372 are,
respectively, the 5' and 3' untranslated nucleotides.

SEQ ID NO:2 is the 383 amino acid protein sequence
deduced from the open reading frame (nucleotides 93-1244)
in SEQ ID NO:1.
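
The coordinates given for SEQ ID NO:1 can be checked against the stated protein length of SEQ ID NO:2 with simple arithmetic. The sketch below assumes 1-based, inclusive nucleotide coordinates and an open reading frame that includes its termination codon, as the description of SEQ ID NO:1 implies; the helper names are illustrative only.

```python
def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1


def orf_protein_length(start: int, end: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an open reading frame.

    Assumption: when includes_stop is True the last codon is the
    termination codon and does not contribute an amino acid.
    """
    nucleotides = span_length(start, end)
    if nucleotides % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = nucleotides // 3
    return codons - 1 if includes_stop else codons


# Figures stated for SEQ ID NO:1 (Arabidopsis thaliana cDNA).
total_length = 1372
five_prime_utr = span_length(1, 92)         # 92 nt
orf = span_length(93, 1244)                 # 1152 nt
three_prime_utr = span_length(1245, 1372)   # 128 nt

print(orf_protein_length(93, 1244))                           # 383
print(five_prime_utr + orf + three_prime_utr == total_length)  # True
```

The result, 383 amino acids, agrees with the length stated for SEQ ID NO:2, and the three regions account for all 1372 base pairs.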

SEQ ID NO:3 shows the 5' to 3' nucleotide sequence
of 1394 base pairs of the Brassica napus cDNA which
encodes microsomal delta-12 desaturase in plasmid
pCF2-165d. Nucleotides 99 to 101 and nucleotides 1248
to 1250 are, respectively, the putative initiation codon
and the termination codon of the open reading frame
(nucleotides 99 to 1250). Nucleotides 1 to 98 and 1251
to 1394 are, respectively, the 5' and 3' untranslated
nucleotides.

SEQ ID NO:4 is the 383 amino acid protein sequence
deduced from the open reading frame (nucleotides 99 to
1250) in SEQ ID NO:3.

SEQ ID NO:5 shows the 5' to 3' nucleotide sequence
of 1369 base pairs of soybean (Glycine max) cDNA which
encodes microsomal delta-12 desaturase in plasmid
pSF2-169K. Nucleotides 108 to 110 and nucleotides 1245
to 1247 are, respectively, the putative initiation codon
and the termination codon of the open reading frame
(nucleotides 108 to 1247). Nucleotides 1 to 107 and
1248 to 1369 are, respectively, the 5' and 3'
untranslated nucleotides.

SEQ ID NO:6 is the 381 amino acid protein sequence
deduced from the open reading frame (nucleotides 113 to
1258) in SEQ ID NO:5.

SEQ ID NO:7 shows the 5' to 3' nucleotide sequence
of 1790 base pairs of corn (Zea mays) cDNA which encodes
microsomal delta-12 desaturase in plasmid pFad2#1.
Nucleotides 165 to 167 and nucleotides 1326 to 1328 are,
respectively, the putative initiation codon and the
termination codon of the open reading frame (nucleotides
164 to 1328). Nucleotides 1 to 163 and 1329 to 1790
are, respectively, the 5' and 3' untranslated
nucleotides.

SEQ ID NO:8 is the 387 amino acid protein sequence
deduced from the open reading frame (nucleotides 164 to
1328) in SEQ ID NO:7.

SEQ ID NO:9 shows the 5' to 3' nucleotide sequence
of 673 base pairs of castor (Ricinus communis)
incomplete cDNA which encodes part of a microsomal
delta-12 desaturase in plasmid pRF2-1C. The sequence
encodes an open reading frame from base 1 to base 673.

SEQ ID NO:10 is the 219 amino acid protein sequence
deduced from the open reading frame (nucleotides 1 to
657) in SEQ ID NO:9.

SEQ ID NO:11 shows the 5' to 3' nucleotide sequence
of 1369 base pairs of castor (Ricinus communis) cDNA
which encodes part of a microsomal delta-12 desaturase
or desaturase-related enzyme in plasmid pRF197C-42.
Nucleotides 184 to 186 and nucleotides 1340 to 1342 are,
respectively, the putative initiation codon and the
termination codon of the open reading frame (nucleotides
184 to 1347). Nucleotides 1 to 183 and 1348 to 1369
are, respectively, the 5' and 3' untranslated
nucleotides.

SEQ ID NO:12 is the 387 amino acid protein sequence
deduced from the open reading frame (nucleotides 184 to
1342) in SEQ ID NO:11.

SEQ ID NO:13 is the sequence of a set of 64-fold
degenerate 26 nucleotide-long oligomers, designated NS3,
made to conserved amino acids 101-109 of SEQ ID NO:2,
designed to be used as sense primers in PCR to isolate
novel sequences encoding microsomal delta-12 desaturases
or desaturase-like enzymes.
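
The "fold" degeneracy of an oligomer set is the number of distinct sequences it contains, which is the product of the number of bases allowed at each position. A minimal sketch of that calculation follows, using the standard IUPAC ambiguity codes. The 26-nucleotide primer shown is a made-up example chosen to give 64-fold degeneracy; it is not the NS3 sequence, which is not reproduced in this part of the text.

```python
from itertools import product
from math import prod

# Standard IUPAC nucleotide ambiguity codes.
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(len(IUPAC_BASES[base]) for base in primer.upper())


def expand(primer: str) -> list:
    """Enumerate every unambiguous sequence in the degenerate set."""
    choices = (IUPAC_BASES[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical 26-mer (illustration only): N x Y x R x N = 4 * 2 * 2 * 4.
TOY_PRIMER = "ATGGCNTAYGARTGGAACCCNGTACC"

print(len(TOY_PRIMER))          # 26
print(degeneracy(TOY_PRIMER))   # 64
print(len(expand(TOY_PRIMER)))  # 64
```

Computing the degeneracy from the product avoids enumerating the set, which matters for higher degeneracies such as the 256-fold and 128-fold oligomer sets described for other primers.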

SEQ ID NO:14 is the sequence of a set of 64-fold
degenerate and 26 nucleotide-long oligomers, designated
NS9, which is made to conserved amino acids 313-321 of
SEQ ID NO:2 and designed to be used as antisense primers
in PCR to isolate novel sequences encoding microsomal
delta-12 desaturases or desaturase-like enzymes.

SEQ ID NO:15 shows the 5' to 3' nucleotide sequence
of 2973 bp of Arabidopsis thaliana genomic fragment
containing the microsomal delta-12 desaturase gene
contained in plasmid pAGF2-6. Its nucleotides 433 and
2938 correspond to the start and end, respectively, of
SEQ ID NO:1. Its nucleotides 521 to 1654 are the 1134
bp intron.
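
The genomic coordinates given for SEQ ID NO:15 can be reconciled with the cDNA length given for SEQ ID NO:1. The sketch below assumes 1-based, inclusive coordinates and assumes that the single intron described is the only sequence present in the genomic span but absent from the cDNA; both are assumptions made for this check rather than statements from the text.

```python
def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1


# Figures stated for SEQ ID NO:15 (genomic fragment) and SEQ ID NO:1 (cDNA).
genomic_span = span_length(433, 2938)   # region matching the cDNA ends
intron = span_length(521, 1654)         # stated as a 1134 bp intron
cdna_length = 1372                      # stated length of SEQ ID NO:1

print(genomic_span)                          # 2506
print(intron)                                # 1134
print(genomic_span - intron == cdna_length)  # True
```

Removing the 1134 bp intron from the 2506 bp genomic span leaves 1372 bp, which matches the stated length of the cDNA.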

SEQ ID NO:16 is the sequence of a set of 256-fold
degenerate and 25 nucleotide-long oligomers, designated
RB5a, which is made to conserved amino acids 318-326 of
SEQ ID NO:2 and designed to be used as antisense primers
in PCR to isolate novel sequences encoding microsomal
delta-12 desaturases or desaturase-like enzymes.

SEQ ID NO:17 is the sequence of a set of 128-fold
degenerate and 25 nucleotide-long oligomers, designated
RB5b, which is made to conserved amino acids 318-326 of
SEQ ID NO:2 and designed to be used as antisense primers
in PCR to isolate novel sequences encoding microsomal
delta-12 desaturases or desaturase-like enzymes.

DETAILED DESCRIPTION OF THE INVENTION

Applicants have isolated nucleic acid fragments
that encode plant fatty acid desaturases and that are
useful in modifying fatty acid composition in
oil-producing species by genetic transformation.

Thus, transfer of the nucleic acid fragments of the
invention or a part thereof that encodes a functional
enzyme, along with suitable regulatory sequences that
direct the transcription of their mRNA, into a living
cell will result in the production or over-production of
plant fatty acid desaturases and will result in
increased levels of unsaturated fatty acids in cellular
lipids, including triacylglycerols.

Transfer of the nucleic acid fragments of the
invention or a part thereof, along with suitable
regulatory sequences that direct the transcription of
their antisense RNA, into plants will result in the
inhibition of expression of the endogenous fatty acid
desaturase that is substantially homologous with the
transferred nucleic acid fragment and will result in
decreased levels of unsaturated fatty acids in cellular
lipids, including triacylglycerols.

Transfer of the nucleic acid fragments of the
invention or a part thereof, along with suitable
regulatory sequences that direct the transcription of
their mRNA, into plants may result in inhibition by
cosuppression of the expression of the endogenous fatty
acid desaturase gene that is substantially homologous
with the transferred nucleic acid fragment and may
result in decreased levels of unsaturated fatty acids in
cellular lipids, including triacylglycerols.

The nucleic acid fragments of the invention can
also be used as restriction fragment length polymorphism
(RFLP) markers in plant genetic mapping and plant
breeding programs.

The nucleic acid fragments of the invention or
oligomers derived therefrom can also be used to isolate
other related fatty acid desaturase genes using DNA,
RNA, or a library of cloned nucleotide sequences from
the same or different species by well known
sequence-dependent protocols, including, for example, methods of
nucleic acid hybridization and amplification by the
polymerase chain reaction.

Definitions 

In the context of this disclosure, a number of
terms shall be used. Fatty acids are specified by the
number of carbon atoms and the number and position of
the double bond: the numbers before and after the colon
refer to the chain length and the number of double
bonds, respectively. The number following the fatty
acid designation indicates the position of the double
bond from the carboxyl end of the fatty acid with the
"c" affix for the cis-configuration of the double bond.
For example, palmitic acid (16:0), stearic acid (18:0),
oleic acid (18:1, 9c), petroselinic acid (18:1, 6c),
linoleic acid (18:2, 9c, 12c), γ-linolenic acid (18:3,
6c, 9c, 12c) and α-linolenic acid (18:3, 9c, 12c, 15c).
Unless otherwise specified 18:1, 18:2 and 18:3 refer to
oleic, linoleic and linolenic fatty acids. Ricinoleic
acid refers to an 18 carbon fatty acid with a cis-9
double bond and a 12-hydroxyl group. The term "fatty
acid desaturase" used herein refers to an enzyme which
catalyzes the breakage of a carbon-hydrogen bond and the
introduction of a carbon-carbon double bond into a fatty
acid molecule. The fatty acid may be free or esterified
to another molecule including, but not limited to,
acyl-carrier protein, coenzyme A, sterols and the glycerol
moiety of glycerolipids. The term "glycerolipid
desaturases" used herein refers to a subset of the fatty
acid desaturases that act on fatty acyl moieties
esterified to a glycerol backbone. "Delta-12
desaturase" refers to a fatty acid desaturase that
catalyzes the formation of a double bond between carbon
positions 6 and 7 (numbered from the methyl end), i.e.,
those that correspond to carbon positions 12 and 13
(numbered from the carbonyl carbon), of an 18 carbon-long
fatty acyl chain. "Delta-15 desaturase" refers to a
fatty acid desaturase that catalyzes the formation of a
double bond between carbon positions 3 and 4 (numbered
from the methyl end), i.e., those that correspond to
carbon positions 15 and 16 (numbered from the carbonyl
carbon), of an 18 carbon-long fatty acyl chain. Examples
of fatty acid desaturases include, but are not limited
to, the microsomal delta-12 and delta-15 desaturases
that act on phosphatidylcholine lipid substrates; the
chloroplastic or plastid delta-12 and delta-15
desaturases that act on phosphatidylglycerol and
galactolipids; and other desaturases that act on such
fatty acid substrates as phospholipids, galactolipids,
and sulfolipids. "Microsomal desaturase" refers
to the cytoplasmic location of the enzyme, while
"chloroplast desaturase" and "plastid desaturase" refer
to the plastid location of the enzyme. These fatty acid
desaturases may be found in a variety of organisms
including, but not limited to, higher plants, diatoms,
and various eukaryotic and prokaryotic microorganisms
such as fungi and photosynthetic bacteria and algae.
The term "homologous fatty acid desaturases" refers to
fatty acid desaturases that catalyze the same
desaturation on the same lipid substrate. Thus,
microsomal delta-15 desaturases, even from different
plant species, are homologous fatty acid desaturases.
The term "heterologous fatty acid desaturases" refers to
fatty acid desaturases that catalyze desaturations at
different positions and/or on different lipid
substrates. Thus, for example, microsomal delta-12 and
delta-15 desaturases, which act on phosphatidylcholine
lipids, are heterologous fatty acid desaturases, even
when from the same plant. Similarly, microsomal
delta-15 desaturase, which acts on phosphatidylcholine
lipids, and chloroplast delta-15 desaturase, which acts
on galactolipids, are heterologous fatty acid
desaturases, even when from the same plant. It should
be noted that these fatty acid desaturases have never
been isolated and characterized as proteins.
Accordingly, the terms such as "delta-12 desaturase" and
"delta-15 desaturase" are used as a convenience to
describe the proteins encoded by nucleic acid fragments
that have been isolated based on the phenotypic effects
caused by their disruption. They do not imply any
catalytic mechanism. For example, delta-12 desaturase
refers to the enzyme that catalyzes the formation of a
double bond between carbons 12 and 13 of an 18 carbon
fatty acid irrespective of whether it "counts" the
carbons from the methyl end, the carboxyl end, or the first
double bond. The term "fatty acid desaturase-related
enzyme" refers to enzymes whose catalytic product may
not be a carbon-carbon double bond but whose mechanism
of action is similar to that of a fatty acid desaturase
(that is, catalysis of the displacement of a
carbon-hydrogen bond of a fatty acid chain to form a
fatty-hydroxyacyl intermediate or end-product). Examples
include delta-12 hydroxylase which means a delta-12
fatty acid hydroxylase or the oleate hydroxylase
responsible for the synthesis of ricinoleic acid from
oleic acid.

The term "nucleic acid" refers to a large molecule
which can be single-stranded or double-stranded,
composed of monomers (nucleotides) containing a sugar, a
phosphate and either a purine or pyrimidine. A "nucleic
acid fragment" is a fraction of a given nucleic acid
molecule. In higher plants, deoxyribonucleic acid (DNA)
is the genetic material while ribonucleic acid (RNA) is
involved in the transfer of the information in DNA into
proteins. A "genome" is the entire body of genetic
material contained in each cell of an organism. The
term "nucleotide sequence" refers to the sequence of DNA
or RNA polymers, which can be single- or double-
stranded, optionally containing synthetic, non-natural
or altered nucleotide bases capable of incorporation
into DNA or RNA polymers. The term "oligomer" refers to
short nucleotide sequences, usually up to 100 bases
long. As used herein, the term "homologous to" refers
to the relatedness between the nucleotide sequences of
two nucleic acid molecules or between the amino acid
sequences of two protein molecules. Estimates of such
homology are provided by either DNA-DNA or DNA-RNA
hybridization under conditions of stringency as is well
understood by those skilled in the art (Hames and
Higgins, Eds. (1985) Nucleic Acid Hybridisation, IRL
Press, Oxford, U.K.); or by the comparison of sequence
similarity between two nucleic acids or proteins, such
as by the method of Needleman et al. (J. Mol. Biol.
(1970) 48:443-453). As used herein, "substantially
homologous" refers to nucleotide sequences that have
more than 90% overall identity at the nucleotide level
with the coding region of the claimed sequence, such as
genes and pseudo-genes corresponding to the coding
regions. The nucleic acid fragments described herein
include molecules which comprise possible variations,
both man-made and natural, such as but not limited to
(a) those that involve base changes that do not cause a
change in an encoded amino acid, (b) those that involve
base changes that alter an amino acid but do not affect
the functional properties of the protein encoded by the
DNA sequence, (c) those derived from deletions,
rearrangements, amplifications, random or controlled
mutagenesis of the nucleic acid fragment, and (d) even
occasional nucleotide sequencing errors.

"Gene" refers to a nucleic acid fragment that
expresses a specific protein, including regulatory
sequences preceding (5' non-coding) and following (3'
non-coding) the coding region. "Fatty acid desaturase
gene" refers to a nucleic acid fragment that expresses a
protein with fatty acid desaturase activity. "Native"
gene refers to an isolated gene with its own regulatory
sequences as found in nature. "Chimeric gene" refers to
a gene that comprises heterogeneous regulatory and
coding sequences not found in nature. "Endogenous" gene
refers to the native gene normally found in its natural
location in the genome and is not isolated. A "foreign"
gene refers to a gene not normally found in the host
organism but that is introduced by gene transfer.

"Pseudo-gene" refers to a genomic nucleotide sequence
that does not encode a functional enzyme.

"Coding sequence" refers to a DNA sequence that
codes for a specific protein and excludes the non-coding
sequences. It may constitute an "uninterrupted coding
sequence", i.e., lacking an intron, or it may include one
or more introns bounded by appropriate splice junctions.
An "intron" is a nucleotide sequence that is transcribed
in the primary transcript but that is removed through
cleavage and re-ligation of the RNA within the cell to
create the mature mRNA that can be translated into a
protein.

"Initiation codon" and "termination codon" refer to
a unit of three adjacent nucleotides in a coding
sequence that specifies initiation and chain
termination, respectively, of protein synthesis (mRNA
translation). "Open reading frame" refers to the coding
sequence uninterrupted by introns between initiation and
termination codons that encodes an amino acid sequence.

"RNA transcript" refers to the product resulting
from RNA polymerase-catalyzed transcription of a DNA
sequence. When the RNA transcript is a perfect
complementary copy of the DNA sequence, it is referred
to as the primary transcript; alternatively, it may be an
RNA sequence derived from posttranscriptional processing
of the primary transcript, in which case it is referred
to as the mature RNA. "Messenger RNA (mRNA)" refers to
the RNA that is without introns and that can be
translated into protein by the cell. "cDNA" refers to a
double-stranded DNA that is complementary to and derived
from mRNA. "Sense" RNA refers to an RNA transcript that
includes the mRNA. "Antisense RNA" refers to an RNA
transcript that is complementary to all or part of a
target primary transcript or mRNA and that blocks the
expression of a target gene by interfering with the
processing, transport and/or translation of its primary
transcript or mRNA. The complementarity of an antisense
RNA may be with any part of the specific gene
transcript, i.e., at the 5' non-coding sequence, 3'
non-coding sequence, introns, or the coding sequence. In
addition, as used herein, antisense RNA may contain
regions of ribozyme sequences that increase the efficacy
of antisense RNA to block gene expression. "Ribozyme"
refers to a catalytic RNA and includes sequence-specific
endoribonucleases.

As used herein, "suitable regulatory sequences"
refer to nucleotide sequences in native or chimeric
genes that are located upstream (5'), within, and/or
downstream (3') to the nucleic acid fragments of the
invention, which control the expression of the nucleic
acid fragments of the invention. The term "expression",
as used herein, refers to the transcription and stable
accumulation of the sense (mRNA) or the antisense RNA
derived from the nucleic acid fragment(s) of the
invention that, in conjunction with the protein
apparatus of the cell, results in altered levels of the
fatty acid desaturase(s). Expression or overexpression
of the gene involves transcription of the gene and
translation of the mRNA into precursor or mature fatty
acid desaturase proteins. "Antisense inhibition" refers
to the production of antisense RNA transcripts capable
of preventing the expression of the target protein.
"Overexpression" refers to the production of a gene
product in transgenic organisms that exceeds levels of
production in normal or non-transformed organisms.
"Cosuppression" refers to the expression of a foreign
gene which has substantial homology to an endogenous
gene resulting in the suppression of expression of both
the foreign and the endogenous gene. "Altered levels"
refers to the production of gene product(s) in
transgenic organisms in amounts or proportions that
differ from that of normal or non-transformed organisms.

"Promoter" refers to a DNA sequence in a gene,
usually upstream (5') to its coding sequence, which
controls the expression of the coding sequence by
providing the recognition for RNA polymerase and other
factors required for proper transcription. In
artificial DNA constructs promoters can also be used to
transcribe antisense RNA. Promoters may also contain
DNA sequences that are involved in the binding of
protein factors which control the effectiveness of
transcription initiation in response to physiological or
developmental conditions. A promoter may also contain
enhancer elements. An "enhancer" is a DNA sequence which
can stimulate promoter activity. It may be an innate
element of the promoter or a heterologous element
inserted to enhance the level and/or tissue-specificity
of a promoter. "Constitutive promoters" refers to those
that direct gene expression in all tissues and at all
times. "Tissue-specific" or "development-specific"
promoters as referred to herein are those that direct
gene expression almost exclusively in specific tissues,
such as leaves or seeds, or at specific development
stages in a tissue, such as in early or late
embryogenesis, respectively.

The "3' non-coding sequences" refers to the DNA
sequence portion of a gene that contains a
polyadenylation signal and any other regulatory signal
capable of affecting mRNA processing or gene expression.
The polyadenylation signal is usually characterized by
affecting the addition of polyadenylic acid tracts to
the 3' end of the mRNA precursor.

"Transformation" herein refers to the transfer of a
foreign gene into the genome of a host organism and its
genetically stable inheritance. "Restriction fragment
length polymorphism" (RFLP) refers to different sized
restriction fragment lengths due to altered nucleotide
sequences in or around variant forms of genes.
"Molecular breeding" refers to the use of DNA-based
diagnostics, such as RFLP, RAPDs, and PCR, in breeding.
"Fertile" refers to plants that are able to propagate
sexually.

"Plants" refer to photosynthetic organisms, both
eukaryotic and prokaryotic, whereas the term "Higher
plants" refers to eukaryotic plants. "Oil-producing
species" herein refers to plant species which produce
and store triacylglycerol in specific organs, primarily
in seeds. Such species include soybean (Glycine max),
rapeseed and canola (including Brassica napus, B.
campestris), sunflower (Helianthus annuus), cotton
(Gossypium hirsutum), corn (Zea mays), cocoa (Theobroma
cacao), safflower (Carthamus tinctorius), oil palm
(Elaeis guineensis), coconut palm (Cocos nucifera), flax
(Linum usitatissimum), castor (Ricinus communis) and
peanut (Arachis hypogaea). The group also includes non-
agronomic species which are useful in developing
appropriate expression vectors such as tobacco, rapid
cycling Brassica species, and Arabidopsis thaliana, and
wild species which may be a source of unique fatty
acids.

"Sequence-dependent protocols" refer to techniques
that rely on a nucleotide sequence for their utility.
Examples of sequence-dependent protocols include, but
are not limited to, the methods of nucleic acid and
oligomer hybridization and methods of DNA and RNA
amplification such as are exemplified in various uses of
the polymerase chain reaction (PCR).

Various solutions used in the experimental
manipulations are referred to by their common names such
as "SSC", "SSPE", "Denhardt's solution", etc. The
composition of these solutions may be found by reference
to Appendix B of Sambrook, et al. (Molecular Cloning, A
Laboratory Manual, 2nd ed. (1989), Cold Spring Harbor
Laboratory Press).
T-DNA Mutagenesis and Identification of an
Arabidopsis Mutant Defective in
Microsomal Delta-12 Desaturation

In T-DNA mutagenesis (Feldmann, et al., Science
(1989) 243:1351-1354), the integration of T-DNA in the
genome can interrupt normal expression of the gene at or
near the site of the integration. If the resultant
mutant phenotype can be detected and shown genetically
to be tightly linked to the T-DNA insertion, then the
"tagged" mutant locus and its wild type counterpart can
be readily isolated by molecular cloning by one skilled
in the art.

Arabidopsis thaliana seeds were transformed by
Agrobacterium tumefaciens C58C1rif strain harboring the
avirulent Ti-plasmid pGV3850::pAK1003 that has the T-DNA
region between the left and right T-DNA borders replaced
by the origin of replication region and ampicillin
resistance gene of plasmid pBR322, a bacterial kanamycin
resistance gene, and a plant kanamycin resistance gene
(Feldmann, et al., Mol. Gen. Genetics (1987) 208:1-9).
Plants from the treated seeds were self-fertilized and
the resultant progeny seeds, germinated in the presence
of kanamycin, were self-fertilized to give rise to a
population, designated T3, that was segregating for
T-DNA insertions. T3 seeds from approximately 1700 T2
plants were germinated and grown under controlled
environment. Single leaves from each of ten T3 plants of
each line were pooled and analyzed for fatty acid
composition. One line, designated 658, showed an
increased level of oleic acid (18:1). Analysis of twelve
individual T3 seeds of line 658 identified two seeds
that contained greater than 36% oleic acid while the
remaining seeds contained 12-22% oleic acid. The mutant
phenotype of increased level of oleic acid in leaf and
seed tissues of line 658 and its segregation in
individual T3 seeds suggested that line 658 harbors a
mutation that affects desaturation of oleic acid to
linoleic acid in both leaf and seed tissues. When
approximately 200 T3 seeds of line 658 were tested for
their ability to germinate in the presence of kanamycin,
four kanamycin-sensitive seeds were identified,
suggesting multiple, possibly three, T-DNA inserts in
the original T2 line. When progeny seeds of 100
individual T3 plants were analyzed for fatty acid
composition and their ability to germinate on kanamycin,
one plant, designated 658-75, was identified whose
progeny segregated 7 wild type: 2 mutant for the
increased oleic acid and 28 sensitive: 60 resistant for
kanamycin resistance. Approximately 400 T4 progeny
seeds of derivative line 658-75 were grown and their
leaves analyzed for fatty acid composition. Ninety-one
of these seedlings were identified as homozygous for the
mutant (high oleic acid) phenotype. Eighty-three of
these homozygous plants were tested for the presence of
nopaline, another marker for T-DNA, and all of them were
nopaline positive. On the basis of these genetic
studies it was concluded that the mutation in microsomal
delta-12 desaturation was linked to the T-DNA.
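
The segregation figures reported above can be sanity-checked with Mendelian arithmetic. The sketch below is illustrative only and is not part of the original analysis. It assumes independent, unlinked, dominant T-DNA inserts in the selfed progeny of a parent hemizygous at each locus, so that a fraction (1/4)^n of seeds carries no insert; the function names are hypothetical.

```python
def expected_sensitive_fraction(n_loci: int) -> float:
    """Fraction of selfed progeny lacking all of n unlinked dominant inserts."""
    return 0.25 ** n_loci


def chi_square(observed, expected_ratio):
    """Pearson chi-square statistic of observed counts against a ratio."""
    total = sum(observed)
    ratio_total = sum(expected_ratio)
    expected = [total * r / ratio_total for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# About 200 T3 seeds of line 658: expected kanamycin-sensitive seeds
# for one, two and three independent inserts.
for n in (1, 2, 3):
    print(n, 200 * expected_sensitive_fraction(n))

# Progeny of plant 658-75: 60 resistant : 28 sensitive against 3:1,
# and 7 wild type : 2 mutant against 3:1.
print(chi_square([60, 28], [3, 1]))
print(chi_square([7, 2], [3, 1]))
```

Under these assumptions, three inserts predict about 3 sensitive seeds in 200, close to the four observed. Both chi-square values for line 658-75 fall below 3.84, the 5% critical value for one degree of freedom, which is compatible with a single segregating locus.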
Isolation of Arabidopsis 658-75 Genomic DNA
Containing the Disrupted Gene Controlling
Microsomal Delta-12 Desaturation
In order to isolate the gene controlling microsomal
delta-12 desaturation from wild-type Arabidopsis, a
T-DNA-plant DNA "junction" fragment containing a T-DNA
border integrated into the host plant DNA was isolated
from the homozygous mutant plants of the 658-75 line of
Arabidopsis. For this, genomic DNA from the mutant
plant was isolated and completely digested by either Bam
HI or Sal I restriction enzymes. In each case, one of
the resultant fragments was expected to contain the
origin of replication and ampicillin-resistance gene of
pBR322 as well as the left T-DNA-plant DNA junction
fragment. Such fragments were rescued as plasmids by
ligating the digested genomic DNA fragments at a dilute
concentration to facilitate self-ligation and then using
the ligated fragments to transform E. coli cells. While
no ampicillin-resistant colony was obtained from the
plasmid rescue of Sal I-digested plant genomic DNA, a
single ampicillin-resistant colony was obtained from the
plasmid rescue of Bam HI-digested plant genomic DNA.
The plasmid obtained from this transformant was
designated p658-1. Restriction analysis of plasmid
p658-1 with Bam HI, Sal I and Eco RI restriction enzymes
strongly suggested that it contained the expected
14.2 kb portion of the T-DNA (containing pBR322
sequences) and a putative plant DNA/left T-DNA border
fragment in a 1.6 kb Eco RI-Bam HI fragment. The 1.6 kb
Eco RI-Bam HI fragment was subcloned into pBluescript SK
[Stratagene] by standard cloning procedures described in
Sambrook et al., (Molecular Cloning, A Laboratory
Manual, 2nd ed. (1989), Cold Spring Harbor Laboratory
Press) and the resultant plasmid was designated pS1658.

Isolation of Microsomal Delta-12 Desaturase cDNA
and Gene from Wild Type Arabidopsis

The 1.6 kb Eco RI-Bam HI fragment, which contained
the putative plant DNA flanking T-DNA, in plasmid p658-1
was isolated and used as a radiolabeled hybridization
probe to screen a cDNA library made from polyA+ mRNA from
the above-ground parts of Arabidopsis thaliana plants,
which varied in size from those that had just opened
their primary leaves to plants which had bolted and were
flowering [Elledge et al. (1991) Proc. Natl. Acad. Sci.
USA 88:1731-1735]. The cDNA inserts in the library were
cloned into an Xho I site flanked by Eco RI sites in
lambda Yes vector [Elledge et al. (1991) Proc. Natl.
Acad. Sci. USA 88:1731-1735]. Of the several positively-
hybridizing plaques, four were subjected to plaque
purification. Plasmids were excised from the purified
phages by site-specific recombination using the cre-lox
recombination system in E. coli strain BNN132 [Elledge
et al. (1991) Proc. Natl. Acad. Sci. USA 88:1731-1735].
The four excised plasmids were digested by Eco RI
restriction enzyme and shown to contain cDNA inserts
ranging in size between 1 kb and 1.5 kb. Partial
nucleotide sequence determination and restriction enzyme
mapping of all four cDNAs revealed their common
identity.

The partial nucleotide sequences of two cDNAs,
designated pSF2b and p92103, containing inserts of ca.
1.2 kb and ca. 1.4 kb, respectively, were determined.
The composite sequence derived from these plasmids is
shown as SEQ ID NO:1 and is expected to be contained
completely in plasmid p92103. SEQ ID NO:1 shows the 5'
to 3' nucleotide sequence of 1372 base pairs of the
Arabidopsis cDNA which encodes microsomal delta-12 fatty
acid desaturase. Nucleotides 93-95 are the putative
initiation codon of the open reading frame (nucleotides
93-1244), identified by comparison with other plant
delta-12 desaturases in this application. Nucleotides
1242-1244 are the termination codon. Nucleotides 1 to
92 and 1245-1372 are the 5' and 3' untranslated
nucleotides, respectively. The 383 amino acid protein
sequence in SEQ ID NO:2 is that deduced from the open
reading frame and has an estimated molecular weight of
44 kD.
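
The coordinates quoted above are internally consistent, which can be confirmed with a short calculation. This is a minimal sketch; the helper name is hypothetical, and it assumes 1-based, inclusive coordinates with the termination codon counted as part of the open reading frame, as in the text.

```python
def orf_summary(first_nt: int, last_nt: int):
    """Return (length in nt, codons including stop, encoded amino acids)
    for an open reading frame given by 1-based inclusive coordinates."""
    length = last_nt - first_nt + 1
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = length // 3
    return length, codons, codons - 1  # the stop codon encodes no residue


cdna_length = 1372
length, codons, residues = orf_summary(93, 1244)
five_prime_utr = 93 - 1             # nucleotides 1 to 92
three_prime_utr = cdna_length - 1244  # nucleotides 1245 to 1372

print(length, codons, residues)        # 1152 384 383
print(five_prime_utr, three_prime_utr)  # 92 128
```

The 1152-nucleotide frame holds 384 codons, i.e. 383 amino acids plus the termination codon, matching the 383-residue sequence of SEQ ID NO:2.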

The gene corresponding to SEQ ID NO:1 was isolated
by screening an Arabidopsis genomic DNA library using
radiolabeled pSF2b cDNA insert, purifying the
positively-hybridizing plaque, and subcloning a 6 kb
Hind III insert fragment from the phage DNA in
pBluescript vector. The sequence of 2973 nucleotides of
the gene is shown in SEQ ID NO:15. Comparison of the
sequences of the gene (SEQ ID NO:15) and the cDNA (SEQ
ID NO:1) revealed the presence of a single intron of
1134 bp at a position between nucleotide positions 88
and 89 of the cDNA, which is 4 nucleotides 5' to the
initiation codon.

The 1.6 kb Eco RI-Bam HI genomic border fragment
insert in pS1658 was also partially sequenced from the
Bam HI and Eco RI ends. Comparison of the nucleotide
sequences of the gene (SEQ ID NO:15), the cDNA (SEQ ID
NO:1), the border fragment, and the published sequence
of the left end of T-DNA (Yadav et al., (1982) Proc.
Natl. Acad. Sci. 79:6322-6326) revealed that a) the
sequence of the first 451 nucleotides of the border
fragment from the Bam HI end is collinear with that of
nucleotides 539 (Bam HI site) to 89 of the cDNA, b) from
the Eco RI end, the border fragment is collinear from
nucleotides 1 to 61 with that of the left end of T-DNA
(except for a deletion of 9 contiguous nucleotides at
position 42 in the border fragment), and is collinear
from nucleotides 57 to 104 with that of nucleotides
41-88 of the cDNA, and c) the sequence divergences
between the border fragment and the cDNA are due to the
presence of the intron in the border fragment. These
results show that the T-DNA disrupted the microsomal
delta-12 desaturase gene in the transcribed region
between the promoter and the coding region and 5' to the
intron in the untranslated sequence.

A phage DNA containing Arabidopsis microsomal
delta-12 desaturase gene was used as a RFLP marker on a
Southern blot containing genomic DNA from several
progeny of Arabidopsis thaliana (ecotype Wassilewskija
and marker line W100 ecotype Landsberg background)
digested with Hind III. This mapped the microsomal
delta-12 desaturase gene 13.6 cM proximal to locus
C3838, 9.2 cM distal to locus lAt228, and 4.9 cM
proximal to Fad D locus on chromosome 3 [Koornneef, M. et
al., (1993) in Genetic Maps, Ed. O'Brien, S. J.; Yadav
et al. (1993) Plant Physiology 1112: 467-476]. This
position corresponds closely to the previously suggested
locus for microsomal delta-12 desaturation (Fad 2)
[Hugly, S. et al., (1991) Heredity 82:4321].

The open reading frames in SEQ ID NO:1 and in
sequences encoding Arabidopsis microsomal delta-15
desaturase [WO 9311245], Arabidopsis plastid delta-15
desaturase [WO 9311245], and cyanobacterial desaturase,
des A, [Wada, et al., Nature (1990) 347:200-203; Genbank
ID:CSDESA; GenBank Accession No:X53508] as well as their
deduced amino acid sequences were compared by the method
of Needleman et al. [J. Mol. Biol. (1970) 48:443-453]
using gap weight and gap length weight values of 5.0 and
0.3, respectively, for the nucleotide sequences and 3.0
and 0.1, respectively, for protein sequences. The
overall identities are summarized in Table 2.

                        TABLE 2

     Percent Identity Between Different Fatty Acid
  Desaturases at the Nucleotide and Amino Acid Levels

                   a3             ad             des A

a2   nucleotide    48 (8 gaps)    46 (6 gaps)    43 (10 gaps)
     amino acid    39 (9 gaps)    34 (8 gaps)    24 (10 gaps)

a3   nucleotide    -              65 (1 gap)     43 (9 gaps)
     amino acid    -              65 (2 gaps)    26 (11 gaps)

ad   nucleotide    -              -              43 (9 gaps)
     amino acid    -              -              26 (11 gaps)

a2, a3, ad, and des A refer, respectively, to SEQ
ID NO:1/2, Arabidopsis microsomal delta-15 desaturase,
Arabidopsis plastid delta-15 desaturase, and
cyanobacterial desaturase, des A. The percent
identities in each comparison are shown at both the
nucleotide and amino acid levels; the number of gaps
imposed by the comparisons are shown in brackets
following the percent identities. As expected on the
basis of unsuccessful attempts in using delta-15 fatty
acid nucleotide sequences as hybridization probes to
isolate nucleotide sequences encoding microsomal
delta-12 fatty acid desaturase, the overall homology at
the nucleotide level between microsomal delta-12 fatty
acid desaturase (SEQ ID NO:1) and the nucleotide
sequences encoding the other three desaturases is poor
(ranging between 43% and 48%). At the amino acid level
too, the microsomal delta-12 fatty acid desaturase (SEQ
ID NO:2) is poorly related to cyanobacterial des A (less
than 24% identity) and the plant delta-15 desaturases
(less than 39% identity).
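
Percent identities of the kind summarized in Table 2 are derived from a global alignment. The following is a minimal sketch of global alignment in the style of Needleman et al., offered for illustration only. It uses a simple linear gap penalty and unit match/mismatch scores, which are assumptions and not the gap weight and gap length weight values quoted above, and the short input sequences are hypothetical.

```python
def needleman_wunsch(a: str, b: str, match=1, mismatch=-1, gap=-1):
    """Globally align a and b; return the two gapped strings."""
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical, non-gap columns as a percentage of alignment length."""
    same = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * same / len(aligned_a)


top, bottom = needleman_wunsch("ACGT", "AGT")
print(top)
print(bottom)
print(percent_identity(top, bottom))
```

Identity is computed here over the full alignment length including gap columns; other conventions (for example, dividing by the shorter sequence length) give different percentages, and the source does not state which convention was used.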

While the overall relatedness between the deduced
amino acid sequence of the said invention and the
published fatty acid desaturases is limited, more
significant identities are observed in shorter stretches
of amino acid sequences in the above comparisons. These
results confirmed that the T-DNA in line 658-75 had
interrupted the normal expression of a fatty acid
desaturase gene. Based on the fatty acid phenotype of
homozygous mutant line 658-75, Applicants concluded that
SEQ ID NO:1 encoded the delta-12 desaturase. Further,
Applicants concluded that it was the microsomal delta-12
desaturase, and not the chloroplastic delta-12
desaturase, since: a) the mutant phenotype was
expressed strongly in the seed but expressed poorly, if
at all, in the leaf of line 658-75, and b) the delta-12
desaturase polypeptide, by comparison to the microsomal
and plastid delta-15 desaturase polypeptides
[WO 9311245], did not have an N-terminal extension of a
transit peptide expected for a nuclear-encoded plastid
desaturase.

Plasmid p92103 was deposited on October 16, 1992
with the American Type Culture Collection of Rockville,
Maryland under the provisions of the Budapest Treaty and
bears accession number ATCC 69095.

Expression of Microsomal Delta-12 Fatty Acid Desaturase
in Arabidopsis Mutant to Complement Its Mutation
in Delta-12 Fatty Acid Desaturation

To confirm the identity of SEQ ID NO:1 (Arabidopsis
microsomal delta-12 fatty acid desaturase cDNA) a
chimeric gene comprising SEQ ID NO:1 was transformed
into an Arabidopsis mutant affected in microsomal
delta-12 fatty acid desaturation. For this, the ca.
1.4 kb Eco RI fragment containing the cDNA (SEQ ID NO:1)
was isolated from plasmid p92103 and sub-cloned in
pGA748 vector [An et al. (1988) Binary Vectors. In:
Plant Molecular Biology Manual. Eds Gelvin, S. B. et al.
Kluwer Academic Press], which was previously linearized
with Eco RI restriction enzyme. In one of the resultant
binary plasmids, designated pGA-Fad2, the cDNA was placed
in the sense orientation behind the CaMV 35S promoter of
the vector to provide constitutive expression.

Binary vector pGA-Fad2 was transformed by the
freeze/thaw method [Holsters et al. (1978) Mol. Gen.
Genet. 163:181-187] into Agrobacterium tumefaciens
strain R1000, carrying the Ri plasmid pRiA4b from
Agrobacterium rhizogenes [Moore et al., (1979) Plasmid
2:617-626] to result in transformants R1000/pGA-Fad2.

Agrobacterium strains R1000 and R1000/pGA-Fad2 were
used to transform Arabidopsis mutant fad2-1 [Miquel, M.
& Browse, J. (1992) Journal of Biological Chemistry
267:1502-1509] and strain R1000 was used to transform
wild type Arabidopsis. Young bolts of plants were
sterilized and cut so that a single node was present in
each explant. Explants were inoculated by Agrobacteria
and incubated at 25°C in the dark on drug-free MS
minimal organics medium with 30 g/L sucrose (Gibco).
After four days, the explants were transferred to fresh
MS medium containing 500 mg/L cefotaxime and 250 mg/ml
carbenicillin for the counterselection of Agrobacterium.
After 5 days, hairy roots derived from R1000/pGA-Fad2
transformation were excised and transferred to the same
medium containing 50 mg/ml kanamycin. Fatty acid methyl
esters were prepared from 5-10 mm of the roots
essentially as described by Browse et al., (Anal.
Biochem. (1986) 152:141-145) except that 2.5% H2SO4 in
methanol was used as the methylation reagent and samples
were heated for 1.5 h at 80°C to effect the methanolysis
of the seed triglycerides. The results are shown in
Table 3. Root samples 41 to 46, 48 to 51, 58, and 59
are derived from transformation of fad2-1 plants with
R1000/pGA-Fad2; root samples 52, 53, and 57 were derived
from transformation of fad2-1 plants with R1000 and
serve as controls; root sample 60 is derived from
transformation of wild type Arabidopsis with R1000 and
also serves as a control.

                        TABLE 3

         Fatty Acid Composition in Transgenic
      Arabidopsis fad2-1 Hairy Roots Transformed
          with Agrobacterium R1000/pGA-Fad2

Sample       16:0     16:1     18:0     18:1     18:2     18:3

41           24.4      1.8      1.7      5.0     29.4     33.8
42           25.6      3.7      1.3     20.0     22.0     27.5
43           23.6               1.6      7.2     27.6     36.1
44    [illegible]      1.3      4.6     16.0     18.1     33.6
45           20.7               8.1     44.7     11.8     14.8
46           20.1               1.8      7.5     33.7     36.0
48           26.1      2.9      2.1      9.5     17.6     33.4
49           30.8      1.0      2.4      8.7     18.7     31.1
50           19.8      1.9      3.3     27.7     21.8     24.4
51           20.9      1.1      5.0     13.7     25.0     32.1
58           23.5      0.3      1.4      3.6     22.1     45.9
59           22.6      0.6      1.4      2.8     29.9     40.4
52, cont.    12.3               2.6     64.2      4.6     16.4
53, cont.    20.3      9.1      2.2     55.2      1.7      9.2
57, cont.    10.4      2.4      0.7     65.9      3.8     12.7
60, WT       23.0      1.7      0.8      6.0     35.0     31.8

These results show that expression of Arabidopsis
microsomal delta-12 desaturase in a mutant Arabidopsis
lacking delta-12 desaturation can result in partial to
complete complementation of the mutant. The decrease in
oleic acid levels in transgenic roots is accompanied by
increases in the levels of both 18:2 and 18:3. Thus,
overexpression of this gene in other oil crops, especially
canola, which is a close relative of Arabidopsis and which
naturally has high levels of 18:1 in seeds, is also expected
to result in higher levels of 18:2, which in conjunction with
the overexpression of the microsomal delta-15 fatty acid
desaturase will result in very high levels of 18:3.
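
One way to express the degree of complementation is the share of 18-carbon unsaturated fatty acid that has passed beyond oleic acid. The sketch below is illustrative and not part of the original analysis. The index itself is an assumption, as is the reading of the last three values in each Table 3 row as 18:1, 18:2 and 18:3; the function name is hypothetical.

```python
def desaturated_fraction(c18_1: float, c18_2: float, c18_3: float) -> float:
    """Fraction of 18:1 + 18:2 + 18:3 present as 18:2 or 18:3."""
    return (c18_2 + c18_3) / (c18_1 + c18_2 + c18_3)


# Values taken from Table 3 (percent of total fatty acids).
samples = {
    "41 (fad2-1 + R1000/pGA-Fad2)": (5.0, 29.4, 33.8),
    "57 (fad2-1 control)": (65.9, 3.8, 12.7),
    "60 (wild type control)": (6.0, 35.0, 31.8),
}

for name, values in samples.items():
    print(name, round(desaturated_fraction(*values), 3))
```

By this measure the transformed mutant root 41 is close to the wild type control, while the untransformed mutant control 57 remains far lower, in line with the complementation described above.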

Using Arabidopsis Microsomal Delta-12 Desaturase
cDNA as a Hybridization Probe to Isolate Microsomal
Delta-12 Desaturase cDNAs from Other Plant Species

Evidence for conservation of the delta-12
desaturase sequences amongst species was provided by
using the Arabidopsis cDNA insert from pSF2b as a
hybridization probe to clone related sequences from
Brassica napus and soybean. Furthermore, corn and
castor bean microsomal delta-12 fatty acid desaturases
were isolated by PCR using primers made to conserved
regions of microsomal delta-12 desaturases.

Cloning of a Brassica napus Seed 

15 cDNA Enc odinc^ Seed Microsomal Delta-12 

Fatty Acid Desaturase 
For the purpose of cloning the Brassica seed
cDNA encoding a delta-12 fatty acid desaturase, the cDNA
insert from pSF2b was isolated by digestion of pSF2b
with EcoR I followed by purification of the 1.2 kb
insert by gel electrophoresis. The 1.2 kb fragment was
radiolabeled and used as a hybridization probe to screen
a lambda phage cDNA library made with poly A+ mRNA from
developing Brassica napus seeds 20-21 days after
pollination. Approximately 600,000 plaques were
screened under low stringency hybridization conditions
(50 mM Tris pH 7.6, 6X SSC, 5X Denhardt's, 0.5% SDS,
100 µg denatured calf thymus DNA, and 50°C) and washes
(two washes with 2X SSC, 0.5% SDS at room temperature
for 15 min each, then twice with 0.2X SSC, 0.5% SDS at
room temperature for 15 min each, and then twice with
0.2X SSC, 0.5% SDS at 50°C for 15 min each). Ten
strongly-hybridizing phage were plaque-purified and the
size of the cDNA inserts was determined by PCR
amplification of the insert using phage as template and
T3/T7 oligomers for primers. Two of these phages, 165D
and 165F, had PCR-amplified inserts of 1.6 kb and 1.2 kb,
respectively, and these phages were also used to excise
the phagemids as described above. The phagemid derived
from phage 165D, designated pCF2-165D, contained a
1.5 kb insert which was sequenced completely on one
strand. SEQ ID NO:3 shows the 5' to 3' nucleotide
sequence of 1394 base pairs of the Brassica napus cDNA
which encodes delta-12 desaturase in plasmid pCF2-165D.
Nucleotides 99 to 101 and nucleotides 1248 to 1250 are,
respectively, the putative initiation codon and the
termination codon of the open reading frame (nucleotides
99 to 1250). Nucleotides 1 to 98 and 1251 to 1394 are,
respectively, the 5' and 3' untranslated nucleotides.
The 383 amino acid protein sequence deduced from the
open reading frame in SEQ ID NO:3 is shown in SEQ ID
NO:4. While the other strand can easily be sequenced
for confirmation, comparisons of SEQ ID NOS:1 and 3 and
of SEQ ID NOS:2 and 4, even admitting of possible
sequencing errors, showed an overall homology of
approximately 84% at both the nucleotide and amino acid
levels, which confirmed that pCF2-165D is a Brassica
napus seed cDNA that encodes delta-12 desaturase.
Plasmid pCF2-165D was deposited on October 16, 1992
with the American Type Culture Collection of Rockville,
Maryland under the provisions of the Budapest Treaty and
bears accession number ATCC 69094.
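
The coordinates quoted above for the Brassica napus clone can be checked with simple arithmetic. The following Python sketch is illustrative only and is not part of the original disclosure. The function name `orf_summary` and its parameters are hypothetical, and it assumes 1-based inclusive nucleotide positions with the termination codon counted inside the stated open reading frame, as in the description of SEQ ID NO:3.

```python
def orf_summary(start, end, total_len, stop_included=True):
    """Summarise a 1-based, inclusive open reading frame.

    start, end    : first and last nucleotide of the reading frame
    total_len     : length of the whole cDNA sequence
    stop_included : True if the termination codon lies inside start..end
    """
    orf_nt = end - start + 1
    if orf_nt % 3 != 0:
        raise ValueError("reading frame length is not a multiple of three")
    codons = orf_nt // 3
    # The termination codon does not contribute an amino acid.
    residues = codons - 1 if stop_included else codons
    return {
        "orf_nt": orf_nt,
        "codons": codons,
        "residues": residues,
        "utr5": (1, start - 1),
        "utr3": (end + 1, total_len),
    }


# Values taken from the description of SEQ ID NO:3 (Brassica napus).
brassica = orf_summary(start=99, end=1250, total_len=1394)
print(brassica)
```

With these inputs the reading frame spans 1152 nucleotides (384 codons including the termination codon), giving the 383 residues and the untranslated regions 1 to 98 and 1251 to 1394 stated in the text.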

Cloning of a Soybean (Glycine max)
cDNA Encoding Seed Microsomal Delta-12
Fatty Acid Desaturase

A cDNA library was made to poly A+ mRNA isolated
from developing soybean seeds, and screened as described
above. Radiolabeled probe prepared from pSF2b as
described above was added, and allowed to hybridize for
18 h at 50°C. The probes were washed as described
above. Autoradiography of the filters indicated that
there were 14 strongly hybridizing plaques, and numerous
weakly hybridizing plaques. Six of the 14 strongly
hybridizing plaques were plaque purified as described
above and the cDNA insert size was determined by PCR
amplification of the insert using phage as template and
T3/T7 oligomers for primers. One of these phages, 169K,
had an insert size of 1.5 kb and this phage was also
used to excise the phagemid as described above. The
phagemid derived from phage 169K, designated pSF2-169K,
contained a 1.5 kb insert which was sequenced completely
on both strands. SEQ ID NO:5 shows the 5' to 3'
nucleotide sequence of 1473 base pairs of soybean
(Glycine max) cDNA which encodes delta-12 desaturase in
plasmid pSF2-169K. Nucleotides 108 to 110 and
nucleotides 1245 to 1247 are, respectively, the putative
initiation codon and the termination codon of the open
reading frame (nucleotides 108 to 1247). Nucleotides 1
to 107 and 1248 to 1462 are, respectively, the 5' and 3'
untranslated nucleotides. The 380 amino acid protein
sequence deduced from the open reading frame in SEQ ID
NO:5 is shown in SEQ ID NO:6. Comparisons of SEQ ID
NOS:1 and 5 and of SEQ ID NOS:2 and 6, even admitting of
possible sequencing errors, showed an overall homology
of approximately 65% at the nucleotide level and
approximately 70% at the amino acid level, which
confirmed that pSF2-169K is a soybean seed cDNA that
encodes delta-12 desaturase. A further description of
this clone can be obtained by comparison of SEQ ID
NO:1, SEQ ID NO:3, and SEQ ID NO:5 and by analyzing the
phenotype of transgenic plants produced by using
chimeric genes incorporating the insert of pSF2-169K, in
sense or antisense orientation, with suitable regulatory
sequences. Plasmid pSF2-169K was deposited on
October 16, 1992 with the American Type Culture
Collection of Rockville, Maryland under the provisions
of the Budapest Treaty and bears accession number
ATCC 69092.

Cloning of a Corn (Zea mays)
cDNA Encoding Seed Microsomal Delta-12
Fatty Acid Desaturase
Corn microsomal delta-12 desaturase cDNA was
isolated using a PCR approach. For this, a cDNA library
was made to poly A+ RNA from developing corn embryos in
Lambda Zap II vector. This library was used as template
for PCR using sets of degenerate oligomers NS3 (SEQ ID
NO:13) and RB5A/B (SEQ ID NOS:16 and 17) as sense and
antisense primers, respectively. NS3 and RB5A/B
correspond to stretches of amino acids 101-109 and
318-326, respectively, of SEQ ID NO:2, which are
conserved in most microsomal delta-12 desaturases (for
example, SEQ ID NOS:2, 4, 6, 8). PCR was carried out
using a PCR kit (Perkin-Elmer) by 40 cycles of 94°C, 1 min,
45°C, 1 min, and 55°C, 2 min. Analyses of the PCR
products on an agarose gel showed the presence of a
product of the expected size (720 bp), which was absent
in control reactions containing either the sense or
antisense primers alone. The fragment was gel purified
and then used as a probe for screening the corn cDNA
library at 60°C as described above. One positively-
hybridizing plaque was purified and partial sequence
determination of its cDNA showed it to be a nucleotide
sequence encoding microsomal delta-12 desaturase but
truncated at the 3' end. The cDNA insert encoding the
partial desaturase was gel isolated and used to probe
the corn cDNA library again. Several positive plaques
were recovered and characterized. DNA sequence analysis
revealed that all of these clones seem to represent the
same sequence with different lengths of 5' or 3'
ends. The clone containing the longest insert,
designated pFad2#1, was sequenced completely. The total
length of the cDNA is 1790 bp (SEQ ID NO:7), comprising
an open reading frame from nucleotide 165 to 1328
that encodes a polypeptide of 388 amino acids. The
deduced amino acid sequence of the polypeptide (SEQ ID
NO:8) shared overall identities of 71%, 40%, and 38% to
Arabidopsis microsomal delta-12 desaturase, Arabidopsis
microsomal delta-15 desaturase, and Arabidopsis plastid
delta-15 desaturase, respectively. Furthermore, it
lacked an N-terminal amino acid extension that would
indicate it is a plastid enzyme. Based on these
considerations, it is concluded that it encodes a
microsomal delta-12 desaturase.

Isolation of cDNAs Encoding
Delta-12 Microsomal Fatty Acid Desaturases and
Desaturase-Related Enzymes from Castor Bean Seed
Polysomal mRNA was isolated from castor beans of
stages I-II (5-10 DAP) and also from castor beans of
stages IV-V (20-25 DAP). Ten ng of each mRNA was used
for separate RT-PCR reactions, using the Perkin-Elmer
RT-PCR kit. The reverse transcriptase reaction was
primed with random hexamers and the PCR reaction with
degenerate delta-12 desaturase primers NS3 and NS9 (SEQ
ID NOS:13 and 14). The annealing-extension temperature
of the PCR reaction was 50°C. A DNA fragment of approx.
700 bp was amplified from both stage I-II and stage IV-V
mRNA. The amplified DNA fragment from one of the
reactions was gel purified and cloned into a pGEM-T
vector using the Promega pGEM-T PCR cloning kit to
create the plasmid pRF2-1C. The 700 bp insert in
pRF2-1C was sequenced, as described above, and the
resulting DNA sequence is shown in SEQ ID NO:9. The DNA
sequence in SEQ ID NO:9 contains an open reading frame
encoding 219 amino acids (SEQ ID NO:10) which has 81%
identity (90% similarity) with amino acids 135 to 353 of



wo 94/11516 



PCT/US93/09987 




the Arabidopsis microsomal delta-12 desaturase described
in SEQ ID NO:2. The cDNA insert in pRF2-1C is therefore
a 676 bp fragment of a full-length cDNA encoding a
castor bean seed microsomal delta-12 desaturase. The
full-length castor bean seed microsomal delta-12
desaturase cDNA may be isolated by screening a castor seed
cDNA library, at 60°C, with the labeled insert of
pRF2-1C as described in the example above. The insert
in pRF2-1C may also be used to screen castor bean
libraries at lower temperatures to isolate delta-12
desaturase-related sequences, such as the delta-12
hydroxylase.

A cDNA library made to poly A+ mRNA isolated from
developing castor beans (stages IV-V, 20-25 DAP) was
screened as described above. Radiolabeled probes
prepared from pSF2b or pRF2-1C, as described above, were
added, and allowed to hybridize for 18 h at 50°C. The
filters were washed as described above. Autoradiography
of the filters indicated that there were numerous
hybridizing plaques, which appeared either strongly
hybridizing or weakly hybridizing. Three of the
strongly hybridizing plaques (190A-41, 190A-42 and
190A-44) and three of the weakly hybridizing plaques
(190B-41, 190b-43 and 197c-42) were plaque purified
using the methods described above. The cDNA insert size
of the purified phages was determined by PCR
amplification of the insert using phage as template and
lambda-gt11 oligomers (Clontech lambda-gt11 Amplimers)
for primers. The PCR-amplified inserts of the amplified
phages were subcloned into pBluescript (Pharmacia) which
had been cut with Eco RI and filled in with Klenow
(Sambrook et al., Molecular Cloning, A Laboratory
Manual, 2nd ed. (1989), Cold Spring Harbor Laboratory
Press). The resulting plasmids were called pRF190a-41,
pRF190a-42, pRF190a-44, pRF190b-41, pRF190b-43 and
pRF197c-42. All the inserts were about 1.1 kb with
the exception of pRF197c-42 which was approx. 1.5 kb.
The inserts in the plasmids were sequenced as described
above. The insert in pRF190b-43 did not contain any
open reading frame and was not identified. The inserts
in pRF190a-41, pRF190a-42, pRF190a-44 and pRF190b-41
were identical. The insert in pRF197c-42 contained all
of the nucleotides of the inserts in pRF190a-41,
pRF190a-42, pRF190a-44 and pRF190b-41 plus an additional
approx. 400 bp. It was deduced therefore that the
insert in pRF197c-42 was a longer version of the inserts
in pRF190a-41, pRF190a-42, pRF190a-44 and pRF190b-41 and
all were derived from the same full-length mRNA. The
complete cDNA sequence of the insert in plasmid
pRF197c-42 is shown in SEQ ID NO:11. The deduced amino
acid sequence of SEQ ID NO:11, shown in SEQ ID NO:12, is
78.5% identical (90% similarity) to the castor
microsomal delta-12 desaturase described above (SEQ ID
NO:10) and 66% identical (80% similarity) to the
Arabidopsis delta-12 desaturase amino acid sequence in
SEQ ID NO:2. These similarities confirm that pRF197c-42
is a castor bean seed cDNA that encodes a microsomal
delta-12 desaturase or a microsomal delta-12 desaturase-
related enzyme, such as a delta-12 hydroxylase.

Specific PCR primers for pRF2-1C and pRF197c-42 were
made. For pRF2-1C the upstream primer was bases 180 to
197 of the cDNA sequence in SEQ ID NO:9. For pRF197c-42
the upstream primer was bases 717 to 743 of the cDNA
sequence in SEQ ID NO:11. A common downstream primer
was made corresponding to the exact complement of
nucleotides 463 to 478 of the sequence described in SEQ
ID NO:9. Using RT-PCR with random hexamers and the
above primers it was observed that the cDNA contained in
pRF2-1C is expressed in both Stage I-II and Stage IV-V
castor bean seeds whereas the cDNA contained in plasmid
pRF197c-42 is expressed only in Stage IV-V castor bean
seeds, i.e., it is only expressed in tissue actively
synthesizing ricinoleic acid. Thus, it is possible that
this cDNA encodes a delta-12 hydroxylase.
There is enough deduced amino acid sequence from
the two castor sequences described in SEQ ID NOS:10 and
12 to compare the highly conserved region corresponding
to amino acids 311 to 353 of SEQ ID NO:2. When SEQ ID
NOS:2, 4, 6, 8, and 10 are aligned by the Hein method
described above, the consensus sequence corresponds
exactly to amino acids 311 to 353 of SEQ ID NO:2.
All of the seed microsomal delta-12 desaturases
described above have a high degree of identity with the
consensus over this region, namely Arabidopsis (100%
identity), soybean (90% identity), corn (95% identity),
canola (93% identity) and one (pRF2-1C) of the castor
bean sequences (100% identity). The other castor bean
seed delta-12 desaturase or desaturase-related sequence
(pRF197c-42), however, has less identity with the
consensus, namely 81% for the deduced amino acid
sequence of the insert in pRF197c-42 (described in SEQ
ID NO:12). Thus, while it remains possible that the
insert in pRF197c-42 encodes a microsomal delta-12
desaturase, this observation supports the hypothesis
that it encodes a desaturase-related sequence, namely
the delta-12 hydroxylase.

An additional approach to cloning a castor bean
seed delta-12 hydroxylase is the screening of a
differential population of cDNAs. A lambda-Zap
(Stratagene) cDNA library made to polysomal mRNA
isolated from developing castor bean endosperm (stages
IV-V, 20-25 DAP) was screened with 32P-labeled cDNA made
from polysomal mRNA isolated from developing castor bean
endosperm (stage I-II, 5-10 DAP) and with 32P-labeled
cDNA made from polysomal mRNA isolated from developing
castor bean endosperm (stages IV-V, 20-25 DAP). The
library was screened at a density of 2000 plaques per
137 mm plate so that individual plaques were isolated.
About 60,000 plaques were screened and plaques which
hybridized with late (stage IV/V) cDNA but not early
(stage I/II) cDNA, which corresponded to about 1 in
every 200 plaques, were pooled.
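
The scale of this differential screen follows directly from the figures given. The short Python sketch below is only a worked illustration; the function name `screening_estimate` and its parameters are hypothetical, and the plate count and pool size it prints are derived from the stated numbers rather than reported in the text.

```python
import math


def screening_estimate(total_plaques, plaques_per_plate, one_in):
    """Estimate plates needed and plaques expected in the pooled fraction.

    total_plaques     : number of plaques screened
    plaques_per_plate : plating density
    one_in            : one differentially hybridizing plaque per this many
    """
    plates = math.ceil(total_plaques / plaques_per_plate)
    pooled = total_plaques // one_in
    return plates, pooled


# Figures from the text: about 60,000 plaques, 2000 per 137 mm plate,
# about 1 in every 200 plaques hybridizing only with late-stage cDNA.
plates, pooled = screening_estimate(60_000, 2_000, 200)
print(f"{plates} plates, about {pooled} pooled plaques")
```

Under these assumptions the screen corresponds to roughly 30 plates and a pool on the order of 300 late-specific plaques.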

The library of differentially expressed cDNAs may
be screened with the castor delta-12 desaturase cDNA
described above and/or with degenerate oligonucleotides
based on sequences of amino acids conserved among delta-12
desaturases to isolate related castor cDNAs, including
the cDNA encoding the delta-12 oleate hydroxylase
enzyme. These regions of amino acid conservation may
include, but are not limited to, amino acids 101 to 109,
137 to 145, and 318 to 327 of the amino acid sequence
described in SEQ ID NO:2 or any of the sequences
described in Table 7 below. Examples of such oligomers
are SEQ ID NOS:13, 14, 16, and 17. The insert in
plasmid pCF2-197c may be cut with Eco RI to remove
vector sequences, purified by gel electrophoresis and
labeled as described above. Degenerate oligomers based
on the above conserved amino acid sequences may be
labeled with 32P as described above. The cDNAs cloned
from the developing endosperm difference library which
do not hybridize with early mRNA probe but do hybridize
with late mRNA probe and hybridize with either castor
delta-12 desaturase cDNA or with an oligomer based on
delta-12 desaturase sequences are likely to be the
castor delta-12 hydroxylase. The pBluescript vector
containing the putative hydroxylase cDNA can be excised
and the inserts directly sequenced, as described above.

Clones such as pRF2-1C and pRF197c-42, and other
clones from the differential screening, which, based on
their DNA sequence, are less related to castor bean seed
microsomal delta-12 desaturases and are not any of the
known fatty acid desaturases described above or in
WO 9311245, may be expressed, for example, in soybean
embryos or another suitable plant tissue, or in a
microorganism, such as yeast, which does not normally
contain ricinoleic acid, using suitable expression
vectors and transformation protocols. The presence of
novel ricinoleic acid in the transformed tissue(s)
expressing the castor cDNA would confirm the identity of
the castor cDNA as DNA encoding an oleate
hydroxylase.

Sequence Comparisons Among Seed Microsomal
Delta-12 Desaturases
The percent overall identities between coding
regions of the full-length nucleotide sequences encoding
microsomal delta-12 desaturases were determined by their
alignment by the method of Needleman et al. (J. Mol.
Biol. (1970) 48:443-453) using gap weight and gap length
weight values of 5.0 and 0.3 (Table 4). Here, a2, c2,
s2, z2 and des A refer, respectively, to the nucleotide
sequences encoding microsomal delta-12 fatty acid
desaturases from Arabidopsis (SEQ ID NO:1), Brassica
napus (SEQ ID NO:3), soybean (SEQ ID NO:5), corn (SEQ ID
NO:7), and cyanobacterial des A, whereas r2 refers to
the microsomal delta-12 desaturase or desaturase-related
enzyme from castor bean (SEQ ID NO:12).

TABLE 4

Percent Identity Between the Coding Regions of
Nucleotide Sequences Encoding Different Microsomal
Delta-12 Fatty Acid Desaturases

        c2      s2      z2      des A
a2      84      66      64      43
c2      -       65      62      42
s2      -       -       62      42
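
To make concrete how a percent identity is obtained from a global alignment, the following Python sketch implements a basic Needleman-Wunsch alignment and counts identical columns. It is a simplified illustration and not the program used to generate the tables: it assumes a single linear gap penalty and simple match/mismatch scores instead of the gap weight and gap length weight parameters cited in the text, so it is not expected to reproduce the tabulated values. The function names and scoring values are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align two sequences; return the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of alignment length."""
    x, y = needleman_wunsch(a, b)
    same = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 100.0 * same / len(x)


print(needleman_wunsch("ACGT", "AGT"))
print(percent_identity("THVAHHLF", "THVIHHLF"))
```

Note that the denominator chosen here is the full alignment length including gapped columns; other conventions (for example, the length of the shorter sequence) give different percentages, and the text does not state which convention underlies the tables.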
The overall relatedness between the deduced amino
acid sequences of microsomal delta-12 desaturases or
desaturase-related enzymes of the invention (i.e., SEQ
ID NOS:2, 4, 6, 8, and 12) determined by their alignment
by the method of Needleman et al. (J. Mol. Biol. (1970)
48:443-453) using gap weight and gap length weight
values of 3.0 and 0.1, respectively, is shown in
Table 5. Here a2, c2, s2, z2 and des A refer,
respectively, to microsomal delta-12 fatty acid
desaturases from Arabidopsis (SEQ ID NO:2), Brassica
napus (SEQ ID NO:4), soybean (SEQ ID NO:6), corn (SEQ ID
NO:8), and cyanobacterial des A, whereas r2 refers to
the microsomal desaturase or desaturase-related enzyme
from castor bean (SEQ ID NO:12). The relatedness
between the enzymes is shown as percent overall
identity/percent overall similarity.



TABLE 5

Relatedness Between Different Microsomal
Delta-12 Fatty Acid Desaturases

        c2       s2       r2       z2       des A
a2      84/89    70/85    66/80    71/83    24/50
c2      -        67/80    63/76    69/79    24/51
s2      -        -        67/83    66/82    23/49
r2      -        -        -        61/78    24/51
z2      -        -        -        -        25/49



The high degree of overall identity (60% or
greater) at the amino acid level between the
Brassica napus, soybean, castor and corn enzymes and
the Arabidopsis microsomal delta-12 desaturase, and
their lack of an N-terminal extension of a transit
peptide expected for a nuclear-encoded chloroplast
desaturase, lead Applicants to conclude that SEQ ID
NOS:4, 6, 8, 10, and 12 encode microsomal delta-12
desaturases or desaturase-related enzymes. Further
confirmation of this identity will come from biological
function, that is, by analyzing the phenotype of
transgenic plants or other organisms produced by using
chimeric genes incorporating the above-mentioned
sequences in sense or antisense orientation, with
suitable regulatory sequences. Thus, one can isolate
cDNAs and genes for homologous fatty acid desaturases
from the same or different higher plant species,
especially from the oil-producing species. Furthermore,
based on these comparisons, the Applicants expect
microsomal delta-12 desaturases from all
higher plants to show an overall identity of 60% or more,
and expect to be able to readily isolate homologous fatty acid
desaturase sequences using SEQ ID NOS:1, 3, 5, 7, 9, and
11 by sequence-dependent protocols.

The overall percent identity at the amino acid
level, using the above alignment method, between
selected plant desaturases is illustrated in Table 6.



TABLE 6

Percent Identity Between Selected Plant Fatty Acid
Desaturases at the Amino Acid Level

        a3      aD      c3      cD      s3
a2      38      33      38      35      34
a3      -       65      93      66      67
aD      -       -       66      87      65
c3      -       -       -       67      67
cD      -       -       -       -       65


In Table 6, a2, a3, aD, c3, cD, and s3 refer,
respectively, to SEQ ID NO:2, Arabidopsis microsomal
delta-15 desaturase, Arabidopsis plastid delta-15
desaturase, canola microsomal delta-15 desaturase,
canola plastid delta-15 desaturase, and soybean
microsomal delta-15 desaturase. Based on these
comparisons, the delta-15 desaturases, of both
microsomal and plastid types, have overall identities of
65% or more at the amino acid level, even when from the
same plant species. Based on the above, the Applicants
expect microsomal delta-12 desaturases from all higher
plants to show similar levels of identity (that is, 60%
or more identity at the amino acid level) and that SEQ
ID NOS:1, 3, 5, 7, and 9 can also be used as
hybridization probes to isolate homologous delta-12
desaturase sequences, and possibly sequences for
fatty acid desaturase-related enzymes, such as oleate
hydroxylase, that have an overall amino acid homology of
50% or more.

Similar alignments of protein sequences of plant
microsomal fatty acid delta-12 desaturases [SEQ ID
NOS:2, 4, 6, and 8] and plant delta-15 desaturases
[microsomal and plastid delta-15 desaturases from
Arabidopsis and Brassica napus, WO 9311245] allow
identification of amino acid sequences conserved between
the different desaturases (Table 7).

TABLE 7

Amino Acid Sequences Conserved Between
Plant Microsomal Delta-12 Desaturases and Microsomal and
Plastid Delta-15 Desaturases

         Conserved AA    Consensus Conserved   Consensus Conserved
         Positions in    AA Sequence in        AA Sequence in        Consensus
Region   SEQ ID NO:2     Delta-12 Desaturases  Delta-15 Desaturases  AA Sequence

A        39-44           AIPPHC                [illegible]           AIP(P/K)HC
B        86-90           [illegible]           WPLYW                 WP(L/I)YW
C        104-109         [illegible]           GHDCGH                (A/G)H(D/E)CGH
D        130-134         LLVPY                 ILVPY                 (L/I)LVPY
E        137-142         WKYSHR                WRISHR                W(K/R)(Y/I)SHR
F        140-145         SHRRHH                SHRTHH                SHR(R/T)HH
G        269-274         [illegible]           [illegible]           [illegible]
H        279-282         [illegible]           LPWY                  LP(H/W)Y
I        289-294         WL(R/K)GAL            YLRGGL                (W/Y)L(R/K)G(A/G)L
J        296-302         TVDRDYG               TLDRDYG               T(V/L)DRDYG
K        314-321         THVAHHLF              THVIHHLF              THV(A/I)HHLF
L        318-327         HHLFSTMPHY            HHLFPQIPHY            HHLF(S/P)(T/Q)(M/I)PHY

Table 7 shows twelve regions of conserved amino
acid sequences, designated A-L (column 1), whose
positions in SEQ ID NO:2 are shown in column 2. The
consensus sequences for these regions in plant delta-12
fatty acid desaturases and plant delta-15 fatty acid
desaturases are shown in columns 3 and 4, respectively;
amino acids are shown by standard abbreviations, the
underlined amino acids are conserved between the
delta-12 and the delta-15 desaturases, and amino acids
in brackets represent substitutions found at that
position. The consensus sequences of these regions are
shown in column 5. These short conserved amino acid sequences
and their relative positions further confirm that the
isolated cDNAs encode a fatty acid desaturase.
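
The bracket notation used for the consensus sequences in Table 7 maps naturally onto a pattern match. The Python sketch below is a hypothetical illustration (the function names are not from the source): it converts a consensus such as T(V/L)DRDYG into a regular expression and tests candidate peptides against it, using the region E and region J entries of Table 7 as examples.

```python
import re


def consensus_to_regex(consensus):
    """Convert bracketed alternatives, e.g. '(V/L)', into a character class."""
    def expand(found):
        return "[" + found.group(1).replace("/", "") + "]"
    return re.sub(r"\(([A-Z](?:/[A-Z])+)\)", expand, consensus)


def matches_consensus(consensus, peptide):
    """True if the whole peptide fits the consensus pattern."""
    return re.fullmatch(consensus_to_regex(consensus), peptide) is not None


# Region J: delta-12 (TVDRDYG) and delta-15 (TLDRDYG) both fit the consensus.
print(consensus_to_regex("T(V/L)DRDYG"))
print(matches_consensus("T(V/L)DRDYG", "TVDRDYG"))
print(matches_consensus("T(V/L)DRDYG", "TLDRDYG"))
# Region E: a substitution outside the listed alternatives does not fit.
print(matches_consensus("W(K/R)(Y/I)SHR", "WQYSHR"))
```

A check of this kind only tests exact agreement with the listed alternatives; it does not account for the conservative substitutions that the text elsewhere suggests considering.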

Isolation of Nucleotide Sequences Encoding
Homologous and Heterologous Fatty Acid Desaturases
and Desaturase-like Enzymes
Fragments of the instant invention may be used to
isolate cDNAs and genes of homologous and heterologous
fatty acid desaturases from the same species as the
fragments of the invention or from different species.
Isolation of homologous genes using sequence-dependent
protocols is well-known in the art and Applicants have
demonstrated that the Arabidopsis microsomal delta-12
desaturase cDNA sequence can be used to isolate cDNA for
related fatty acid desaturases from Brassica napus,
soybean, corn and castor bean.

More importantly, one can use the fragments
containing SEQ ID NOS:1, 3, 5, 7, and 9 or their
smaller, more conserved regions to isolate novel fatty
acid desaturases and fatty acid desaturase-related
enzymes.

In a particular embodiment of the present
invention, regions of the nucleic acid fragments of the
invention that are conserved between different
desaturases may be used by one skilled in the art to
design a mixture of degenerate oligomers for use in
sequence-dependent protocols aimed at isolating nucleic
acid fragments encoding homologous or heterologous fatty
acid desaturase cDNAs or genes. For example, in the
polymerase chain reaction (Innis, et al., Eds. (1990)
PCR Protocols: A Guide to Methods and Applications,
Academic Press, San Diego), two short pieces of the
present fragment of the invention can be used to amplify
a longer fatty acid desaturase DNA fragment from DNA or
RNA. The polymerase chain reaction may also be
performed on a library of cloned nucleotide sequences
with one primer based on the fragment of the invention
and the other on either the poly A+ tail or a vector
sequence. These oligomers may be unique sequences or
degenerate sequences derived from the nucleic acid
fragments of the invention. The longer piece of
homologous fatty acid desaturase DNA generated by this
method could then be used as a probe for isolating
related fatty acid desaturase genes or cDNAs from
Arabidopsis or other species and subsequently identified
by differential screening with known desaturase
sequences and by nucleotide sequence determination. The
design of oligomers, including long oligomers using
deoxyinosine, and "guessmers" for hybridization or for
the polymerase chain reaction is known to one skilled
in the art and discussed in Sambrook et al., (Molecular
Cloning, A Laboratory Manual, 2nd ed. (1989), Cold
Spring Harbor Laboratory Press). Short stretches of
amino acid sequences that are conserved between
cyanobacterial delta-12 desaturase (Wada et al., Nature
(1990) 347:200-203) and plant delta-15 desaturases
[WO 9311245] were previously used to make
oligonucleotides that were degenerate and/or used
deoxyinosine/s. One set of these oligomers made to a
stretch of 12 amino acids conserved between
cyanobacterial delta-12 desaturase and higher plant
delta-15 desaturases was successful in cloning the
plastid delta-12 desaturase cDNAs; these plant
desaturases have more than 50% identity to the
cyanobacterial delta-12 desaturase. Some of these
oligonucleotides were also used as primers to make
polymerase chain reaction (PCR) products using poly A+
RNA from plants. However, none of the oligonucleotides
and the PCR products were successful as radiolabeled
hybridization probes in isolating nucleotide sequences
encoding microsomal delta-12 fatty acid desaturases.
Thus, as expected, none of the stretches of four or more
amino acids conserved between Arabidopsis delta-12 and
Arabidopsis delta-15 desaturases are identical in the
cyanobacterial desaturase, just like none of the
stretches of four or more amino acids conserved between
Arabidopsis delta-15 and the cyanobacterial desaturase
are identical in SEQ ID NO:2. Stretches of conserved
amino acids between the present invention and delta-15
desaturases now allow for the design of oligomers to be
used to isolate sequences encoding other desaturases and
desaturase-related enzymes. For example, conserved
stretches of amino acids between delta-12 desaturase and
delta-15 desaturase, shown in Table 7, are useful in
designing long oligomers for hybridization as well as
shorter ones for use as primers in the polymerase chain
reaction. In this regard, sequences conserved between
delta-12 and delta-15 desaturases (shown in Table 7)
would be particularly useful. The consensus sequences
will also take into account conservative substitutions
known to one skilled in the art, such as Lys/Arg,
Glu/Asp, Ile/Val/Leu/Met, Ala/Gly, Gln/Asn, and Ser/Thr.
Amino acid sequences as short as four amino acids long
can successfully be used in PCR [Nunberg et al. (1989)
Journal of Virology 63:3240-3249]. Amino acid sequences
conserved between delta-12 desaturases (SEQ ID NOS:2, 4,
6, 8, and 10) may also be used in sequence-dependent
protocols to isolate fatty acid desaturases and fatty
acid desaturase-related enzymes expected to be more
related to delta-12 desaturases, such as the oleate
hydroxylase from castor bean. Particularly useful are
conserved sequences in column 3 (Table 7), since they
are also conserved well with delta-15 desaturases
(column 4, Table 7).
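
As an illustration of how a conserved amino acid stretch translates into a degenerate oligomer, the Python sketch below back-translates a short peptide, optionally containing bracketed alternatives in the Table 7 style, into IUPAC nucleotide codes using the standard genetic code, and reports the degeneracy of the resulting pool. It is a minimal sketch under stated assumptions and not the procedure used to design SEQ ID NOS:13 to 17: the function name `degenerate_oligo` is hypothetical, degeneracy is collapsed position by position (which can over-cover, for example for arginine, leucine and serine), and practical refinements mentioned in the text, such as deoxyinosine or "guessmers", are not modeled.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {}
for codon, aa in zip(("".join(c) for c in product(BASES, repeat=3)), AMINO):
    CODONS.setdefault(aa, []).append(codon)

# IUPAC nucleotide ambiguity codes, keyed by the set of bases they stand for.
IUPAC = {frozenset(v): k for k, v in {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}.items()}


def degenerate_oligo(pattern):
    """Return (sense-strand degenerate sequence, number of distinct oligos)."""
    oligo, degeneracy = [], 1
    for group, single in re.findall(r"\(([A-Z/]+)\)|([A-Z])", pattern):
        residues = group.split("/") if group else [single]
        codons = [c for aa in residues for c in CODONS[aa]]
        for pos in range(3):
            bases = frozenset(c[pos] for c in codons)
            oligo.append(IUPAC[bases])
            degeneracy *= len(bases)
    return "".join(oligo), degeneracy


# Five residues shared by regions J of the delta-12 and delta-15 consensus.
print(degenerate_oligo("DRDYG"))
```

The output is a sense-strand sequence; an antisense primer would require its reverse complement. High degeneracy values indicate stretches that are poor primer candidates unless alternatives such as deoxyinosine are used.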

Determining the conserved amino acid sequences from
diverse desaturases will also allow one to identify more
and better consensus sequences that will further aid in
the isolation of novel fatty acid desaturases, including
those from non-plant sources such as fungi, algae
(including the desaturases involved in the desaturations
of the long chain n-3 fatty acids), and even
cyanobacteria, as well as other membrane-associated
desaturases from other organisms.

The function of the diverse nucleotide fragments
encoding fatty acid desaturases or desaturase-related
enzymes that can be isolated using the present invention
can be identified by transforming plants with the
isolated sequences, linked in sense or antisense
orientation to suitable regulatory sequences required
for plant expression, and observing the fatty acid
phenotype of the resulting transgenic plants. Preferred
target plants for the transformation are the same as the
source of the isolated nucleotide fragments when the
goal is to obtain inhibition of the corresponding
endogenous gene by antisense inhibition or
cosuppression. Preferred target plants for use in
expression or overexpression of the isolated nucleic
acid fragments are wild type plants or plants with known
mutations in desaturation reactions, such as the
AT>abidQPsis mutants £ad&, lada# f^dH, £ad2L/ and 

iadS; mutant flax deficient in delta-15 desaturation; or 
mutant sunflower deficient in delta-12 desaturation.
Alternatively, the function of the isolated nucleic acid
fragments can be determined similarly via transformation
of other organisms, such as yeast or cyanobacteria, with
chimeric genes containing the nucleic acid fragment and
suitable regulatory sequences followed by analysis of
fatty acid composition and/or enzyme activity.

Overexpression of the Fatty Acid
Desaturase Enzymes in Transgenic Species
The nucleic acid fragment(s) of the instant
invention encoding functional fatty acid desaturase(s),
with suitable regulatory sequences, can be used to
overexpress the enzyme(s) in transgenic organisms. An
example of such expression or overexpression is
demonstrated by transformation of the Arabidopsis mutant
lacking oleate desaturation. Such recombinant DNA
constructs may include either the native fatty acid
desaturase gene or a chimeric fatty acid desaturase gene
isolated from the same or a different species as the
host organism. For overexpression of fatty acid
desaturase(s), it is preferable that the introduced gene
be from a different species to reduce the likelihood of
cosuppression. For example, overexpression of delta-12
desaturase in soybean, rapeseed, or other oil-producing
species to produce altered levels of polyunsaturated
fatty acids may be achieved by expressing RNA from the
full-length cDNA found in p92103, pCF2-165D, and






pSF2-169K. Transgenic lines overexpressing the delta-12
desaturase, when crossed with lines overexpressing
delta-15 desaturases, will result in ultrahigh levels of
18:3. Similarly, the isolated nucleic acid fragments
encoding fatty acid desaturases from Arabidopsis,
rapeseed, and soybean can also be used by one skilled in
the art to obtain other substantially homologous
full-length cDNAs, if not already obtained, as well as the
corresponding genes as fragments of the invention.
These, in turn, may be used to overexpress the
corresponding desaturases in plants. One skilled in the
art can also isolate the coding sequence(s) from the
fragment(s) of the invention by using and/or creating
sites for restriction endonucleases, as described in
Sambrook et al. (Molecular Cloning, A Laboratory
Manual, 2nd ed. (1989), Cold Spring Harbor Laboratory
Press).

One particularly useful application of the claimed
inventions is to repair the agronomic performance of
plant mutants containing ultra high levels of oleate in
seed oil. In Arabidopsis, reduction in linoleate in
phosphatidylcholine due to a mutation in microsomal
delta-12 desaturase affected low temperature survival
[Miquel, M. et al. (1993) Proc. Natl. Acad. Sci. USA
90:6208-6212]. Furthermore, there is evidence that the
poor agronomic performance of canola plants containing
ultra high (>80%) levels of oleate in seed is due to
mutations in the microsomal delta-12 desaturase genes
that reduce the level of linoleate in
phosphatidylcholine of roots and leaves. That is, the
mutations are not seed-specific. Thus, the root- and/or
leaf-specific expression (that is, with no expression in
the seeds) of microsomal delta-12 desaturase activity in
mutants of oilseeds with ultra high levels of oleate in
seed oil will result in agronomically-improved mutant
plants with ultra high levels of oleate in seed oil.

Inhibition of Plant Target
Genes by Use of Antisense RNA

Antisense RNA has been used to inhibit plant target
genes in a tissue-specific manner (see van der Krol et
al., Biotechniques (1988) 6:958-976). Antisense
inhibition has been shown using the entire cDNA sequence
(Sheehy et al., Proc. Natl. Acad. Sci. USA (1988)
85:8805-8809) as well as a partial cDNA sequence (Cannon
et al., Plant Molec. Biol. (1990) 15:39-47). There is
also evidence that the 3' non-coding sequences (Ch'ng
et al., Proc. Natl. Acad. Sci. USA (1989)
86:10006-10010) and fragments of 5' coding sequence,
containing as few as 41 base-pairs of a 1.87 kb cDNA
(Cannon et al., Plant Molec. Biol. (1990) 15:39-47), can
play important roles in antisense inhibition.

The use of antisense inhibition of the fatty acid
desaturases may require isolation of the transcribed
sequence for one or more target fatty acid desaturase
genes that are expressed in the target tissue of the
target plant. The genes that are most highly expressed
are the best targets for antisense inhibition. These
genes may be identified by determining their levels of
transcription by techniques, such as quantitative
analysis of mRNA levels or nuclear run-off
transcription, known to one skilled in the art.

The entire soybean microsomal delta-12 desaturase
cDNA was cloned in the antisense orientation with
respect to the soybean b-conglycinin, soybean KTi3, or
bean phaseolin promoter, and the chimeric gene was
transformed into soybean somatic embryos, which were
previously shown to serve as a good model system for
soybean zygotic embryos and are predictive of seed
composition (Table 11). Transformed somatic embryos
showed inhibition of linoleate biosynthesis. Similarly,
the entire Brassica napus microsomal delta-12 desaturase
cDNA was cloned in the antisense orientation with
respect to a rapeseed napin promoter and the chimeric
gene transformed into B. napus. Seeds of transformed
B. napus plants showed inhibition of linoleate
biosynthesis. Thus, antisense inhibition of delta-12
desaturase in oil-producing species, including corn,
Brassica napus, and soybean, resulting in altered levels
of polyunsaturated fatty acids may be achieved by
expressing antisense RNA from the entire or partial cDNA
encoding microsomal delta-12 desaturase.
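
To make the notion of "antisense orientation" concrete, the following is a minimal Python sketch. It assumes a short, invented DNA sequence (not taken from any sequence of the invention) and hypothetical helper names. It only illustrates that RNA transcribed from a cDNA cloned in the reverse orientation is the reverse complement of the sense transcript and can therefore base-pair with it.

```python
# Illustrative only: the sequence below is invented and is NOT taken
# from the desaturase cDNAs described in the text.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def transcribe(coding_strand: str) -> str:
    """Return RNA with the same sequence as the coding strand (T -> U)."""
    return coding_strand.upper().replace("T", "U")


def is_antisense_to(rna_a: str, rna_b: str) -> bool:
    """True if rna_a can base-pair along its full length with rna_b."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return len(rna_a) == len(rna_b) and all(
        pairs[x] == y for x, y in zip(rna_a, reversed(rna_b))
    )


sense_cdna = "ATGGGTGCAGGTGGAAGA"  # hypothetical cDNA fragment
sense_rna = transcribe(sense_cdna)

# Cloning the cDNA in reverse orientation behind a promoter makes the
# reverse complement the new coding strand, so the transcript is antisense.
antisense_rna = transcribe(reverse_complement(sense_cdna))

print(sense_rna)
print(antisense_rna)
print(is_antisense_to(antisense_rna, sense_rna))
```

The same helper applies to a partial cDNA: any sub-fragment cloned in reverse orientation yields a transcript complementary to the matching stretch of the endogenous mRNA.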

Inhibition of Plant
Target Genes by Cosuppression

The phenomenon of cosuppression has also been used
to inhibit plant target genes in a tissue-specific
manner. Cosuppression of an endogenous gene using the
entire cDNA sequence (Napoli et al., The Plant Cell
(1990) 2:279-289; van der Krol et al., The Plant Cell
(1990) 2:291-299) as well as a partial cDNA sequence
(730 bp of a 1770 bp cDNA) (Smith et al., Mol. Gen.
Genetics (1990) 224:477-481) is known.

The nucleic acid fragments of the instant invention
encoding fatty acid desaturases, or parts thereof, with
suitable regulatory sequences, can be used to reduce the
level of fatty acid desaturases, thereby altering fatty
acid composition, in transgenic plants which contain an
endogenous gene substantially homologous to the
introduced nucleic acid fragment. The experimental
procedures necessary for this are similar to those
described above for the overexpression of the fatty acid
desaturase nucleic acid fragments except that one may
also use a partial cDNA sequence. For example,
cosuppression of delta-12 desaturase in Brassica napus
and soybean resulting in altered levels of
polyunsaturated fatty acids may be achieved by
expressing in the sense orientation the entire or
partial seed delta-12 desaturase cDNA found in pCF2-165D
and pSF2-169K, respectively. Endogenous genes can also
be inhibited by non-coding regions of an introduced copy
of the gene [for example, Brusslan, J. A. et al. (1993)
Plant Cell 5:667-677; Matzke, M. A. et al., Plant
Molecular Biology 16:821-830]. We have shown that an
Arabidopsis gene (SEQ ID NO:15) corresponding to the
cDNA (SEQ ID NO:1) can be isolated. One skilled in the
art can readily isolate genomic DNA containing or
flanking the genes and use the coding or non-coding
regions in such transgene inhibition methods.

Analysis of the fatty acid composition of roots and
seeds of Arabidopsis mutants deficient in microsomal
delta-12 desaturation shows that they have reduced
levels of 18:2 as well as reduced levels of 16:0 (as
much as 40% reduced level in mutant seeds as compared to
wild type seeds) [Miquel and Browse (1990) in Plant
Lipid Biochemistry, Structure, and Utilization,
pages 456-458, Ed. Quinn, P. J. and Harwood, J. L.,
Portland Press]. Reduction in the level of 16:0 is also
observed in ultra high oleate mutants of B. napus.
Thus, one can expect that ultra high levels of 18:1 in
transgenic plants due to antisense inhibition or
cosuppression using the claimed sequences will also
reduce the level of 16:0.

Selection of Hosts, Promoters and Enhancers

A preferred class of heterologous hosts for the
expression of the nucleic acid fragments of the
invention are eukaryotic hosts, particularly the cells
of higher plants. Particularly preferred among the
higher plants are the oil-producing species, such as
soybean (Glycine max), rapeseed (including Brassica
napus, B. campestris), sunflower (Helianthus annuus),
cotton (Gossypium hirsutum), corn (Zea mays), cocoa
(Theobroma cacao), safflower (Carthamus tinctorius), oil
palm (Elaeis guineensis), coconut palm (Cocos nucifera),
flax (Linum usitatissimum), and peanut (Arachis
hypogaea).

Expression in plants will use regulatory sequences
functional in such plants. The expression of foreign
genes in plants is well-established (De Blaere et al.,
Meth. Enzymol. (1987) 153:277-291). The source of the
promoter chosen to drive the expression of the fragments
of the invention is not critical provided it has
sufficient transcriptional activity to accomplish the
invention by increasing or decreasing, respectively, the
level of translatable mRNA for the fatty acid
desaturases in the desired host tissue. Preferred
promoters include (a) strong constitutive plant
promoters, such as those directing the 19S and 35S
transcripts in cauliflower mosaic virus (Odell et al.,
Nature (1985) 313:810-812; Hull et al., Virology (1987)
86:482-493), (b) tissue- or developmentally-specific
promoters, and (c) other transcriptional promoter
systems engineered in plants, such as those using
bacteriophage T7 RNA polymerase promoter sequences to
express foreign genes. Examples of tissue-specific
promoters are the light-inducible promoter of the small
subunit of ribulose 1,5-bis-phosphate carboxylase (if
expression is desired in photosynthetic tissues), the
maize zein protein promoter (Matzke et al., EMBO J.
(1984) 3:1525-1532), and the chlorophyll a/b binding
protein promoter (Lampa et al., Nature (1986)
316:750-752).

Particularly preferred promoters are those that
allow seed-specific expression. This may be especially
useful since seeds are the primary source of vegetable
oils and also since seed-specific expression will avoid
any potential deleterious effect in non-seed tissues.
Examples of seed-specific promoters include, but are not
limited to, the promoters of seed storage proteins,
which can represent up to 90% of total seed protein in
many plants. The seed storage proteins are strictly
regulated, being expressed almost exclusively in seeds
in a highly tissue-specific and stage-specific manner
(Higgins et al., Ann. Rev. Plant Physiol. (1984)
35:191-221; Goldberg et al., Cell (1989) 56:149-160).
Moreover, different seed storage proteins may be
expressed at different stages of seed development.

Expression of seed-specific genes has been studied
in great detail (see reviews by Goldberg et al., Cell
(1989) 56:149-160 and Higgins et al., Ann. Rev. Plant
Physiol. (1984) 35:191-221). There are currently
numerous examples of seed-specific expression of seed
storage protein genes in transgenic dicotyledonous
plants. These include genes from dicotyledonous plants
for bean b-phaseolin (Sengupta-Gopalan et al., Proc.
Natl. Acad. Sci. USA (1985) 82:3320-3324; Hoffman et
al., Plant Mol. Biol. (1988) 11:717-729), bean lectin
(Voelker et al., EMBO J. (1987) 6:3571-3577), soybean
lectin (Okamuro et al., Proc. Natl. Acad. Sci. USA
(1986) 83:8240-8244), soybean Kunitz trypsin inhibitor
(Perez-Grau et al., Plant Cell (1989) 1:1095-1109),
soybean b-conglycinin (Beachy et al., EMBO J. (1985)
4:3047-3053), pea vicilin (Higgins et al., Plant Mol.
Biol. (1988) 11:683-695), pea convicilin (Newbigin et
al., Planta (1990) 180:461-470), pea legumin (Shirsat et
al., Mol. Gen. Genetics (1989) 215:326-331), and rapeseed
napin (Radke et al., Theor. Appl. Genet. (1988)
75:685-694), as well as genes from monocotyledonous
plants such as for maize 15 kD zein (Hoffman et al.,
EMBO J. (1987) 6:3213-3221), maize 18 kD oleosin (Lee et
al., Proc. Natl. Acad. Sci. USA (1991) 88:6181-6185),
barley b-hordein (Harris et al., Plant Mol. Biol. (1988)
10:359-366) and wheat glutenin (Colot et al., EMBO J.
(1987) 6:3559-3564). Moreover, promoters of
seed-specific genes operably linked to heterologous coding
sequences in chimeric gene constructs also maintain
their temporal and spatial expression pattern in
transgenic plants. Such examples include use of the
Arabidopsis thaliana 2S seed storage protein gene
promoter to express enkephalin peptides in Arabidopsis
and B. napus seeds (Vandekerckhove et al.,
Bio/Technology (1989) 7:929-932), bean lectin and bean
b-phaseolin promoters to express luciferase (Riggs et
al., Plant Sci. (1989) 63:47-57), and wheat glutenin
promoters to express chloramphenicol acetyl transferase
(Colot et al., EMBO J. (1987) 6:3559-3564).

Of particular use in the expression of the nucleic
acid fragment of the invention will be the heterologous
promoters from several soybean seed storage protein
genes such as those for the Kunitz trypsin inhibitor
(Jofuku et al., Plant Cell (1989) 1:1079-1093), glycinin
(Nielsen et al., Plant Cell (1989) 1:313-328), and
b-conglycinin (Harada et al., Plant Cell (1989)
1:415-425). Promoters of genes for a- and b-subunits of
soybean b-conglycinin storage protein will be
particularly useful in expressing the mRNA or the
antisense RNA in the cotyledons at mid- to late-stages
of seed development (Beachy et al., EMBO J. (1985)
4:3047-3053) in transgenic plants. This is because
there is very little position effect on their expression
in transgenic seeds, and the two promoters show
different temporal regulation. The promoter for the
a-subunit gene is expressed a few days before that for
the b-subunit gene. This is important for transforming
rapeseed where oil biosynthesis begins about a week
before seed storage protein synthesis (Murphy et al.,
J. Plant Physiol. (1989) 135:63-69).

Also of particular use will be promoters of genes
expressed during early embryogenesis and oil
biosynthesis. The native regulatory sequences, including
the native promoters, of the fatty acid desaturase genes
expressing the nucleic acid fragments of the invention
can be used following their isolation by those skilled
in the art. Heterologous promoters from other genes
involved in seed oil biosynthesis, such as those for
B. napus isocitrate lyase and malate synthase (Comai et
al., Plant Cell (1989) 1:293-300), delta-9 desaturase
from safflower (Thompson et al., Proc. Natl. Acad. Sci.
USA (1991) 88:2578-2582) and castor (Shanklin et al.,
Proc. Natl. Acad. Sci. USA (1991) 88:2510-2514), acyl
carrier protein (ACP) from Arabidopsis
(Post-Beittenmiller et al., Nucl. Acids Res. (1989) 17:1777),
B. napus (Safford et al., Eur. J. Biochem. (1988)
174:287-295), and B. campestris (Rose et al., Nucl.
Acids Res. (1987) 15:7197), b-ketoacyl-ACP synthetase
from barley (Siggaard-Andersen et al., Proc. Natl. Acad.
Sci. USA (1991) 88:4114-4118), and oleosin from Zea mays
(Lee et al., Proc. Natl. Acad. Sci. USA (1991)
88:6181-6185), soybean (Genbank Accession No: X60773)
and B. napus (Lee et al., Plant Physiol. (1991)
96:1395-1397) will be of use. If the sequence of the
corresponding genes is not disclosed or their promoter
region is not identified, one skilled in the art can use
the published sequence to isolate the corresponding gene
and a fragment thereof containing the promoter. The
partial protein sequences for the relatively-abundant
enoyl-ACP reductase and acetyl-CoA carboxylase are also
published (Slabas et al., Biochim. Biophys. Acta (1987)
877:271-280; Cottingham et al., Biochim. Biophys. Acta
(1988) 954:201-207) and one skilled in the art can use
these sequences to isolate the corresponding seed genes
with their promoters. Similarly, the fragments of the
present invention encoding fatty acid desaturases can be
used to obtain promoter regions of the corresponding
genes for use in expressing chimeric genes.

Attaining the proper level of expression of the
nucleic acid fragments of the invention may require the
use of different chimeric genes utilizing different
promoters. Such chimeric genes can be transferred into
host plants either together in a single expression
vector or sequentially using more than one vector.

It is envisioned that the introduction of enhancers
or enhancer-like elements into the promoter regions of
either the native or chimeric nucleic acid fragments of
the invention will result in increased expression to
accomplish the invention. This would include viral
enhancers such as that found in the 35S promoter (Odell
et al., Plant Mol. Biol. (1988) 10:263-272), enhancers
from the opine genes (Fromm et al., Plant Cell (1989)
1:977-984), or enhancers from any other source that
result in increased transcription when placed into a
promoter operably linked to the nucleic acid fragment of
the invention.

Of particular importance is the DNA sequence
element isolated from the gene for the a-subunit of
b-conglycinin that can confer 40-fold seed-specific
enhancement to a constitutive promoter (Chen et al.,
Dev. Genet. (1989) 10:112-122). One skilled in the art
can readily isolate this element and insert it within
the promoter region of any gene in order to obtain
seed-specific enhanced expression with the promoter in
transgenic plants. Insertion of such an element in any
seed-specific gene that is expressed at different times
than the b-conglycinin gene will result in expression in
transgenic plants for a longer period during seed
development.

The invention can also be accomplished by a variety
of other methods to obtain the desired end. In one
form, the invention is based on modifying plants to
produce increased levels of fatty acid desaturases by
virtue of introducing more than one copy of the foreign
gene containing the nucleic acid fragments of the
invention. In some cases, the desired level of
polyunsaturated fatty acids may require introduction of
foreign genes for more than one kind of fatty acid
desaturase.

Any 3' non-coding region capable of providing a
polyadenylation signal and other regulatory sequences
that may be required for the proper expression of the
nucleic acid fragments of the invention can be used to
accomplish the invention. This would include 3' ends of
the native fatty acid desaturase(s), viral genes such as
from the 35S or the 19S cauliflower mosaic virus
transcripts, from the opine synthesis genes, ribulose
1,5-bisphosphate carboxylase, or chlorophyll a/b binding
protein. There are numerous examples in the art that
teach the usefulness of different 3' non-coding regions.

Transformation Methods

Various methods of transforming cells of higher
plants according to the present invention are available
to those skilled in the art (see EPO Pub. 0 295 959 A2
and 0 318 341 A1). Such methods include those based on
transformation vectors utilizing the Ti and Ri plasmids
of Agrobacterium spp. It is particularly preferred to
use the binary type of these vectors. Ti-derived
vectors transform a wide variety of higher plants,
including monocotyledonous and dicotyledonous plants
(Sukhapinda et al., Plant Mol. Biol. (1987) 8:209-216;
Potrykus, Mol. Gen. Genet. (1985) 199:183). Other
transformation methods are available to those skilled in
the art, such as direct uptake of foreign DNA constructs
(see EPO Pub. 0 295 959 A2), techniques of
electroporation (Fromm et al., Nature (1986) (London)
319:791) or high-velocity ballistic bombardment with metal
particles coated with the nucleic acid constructs (Klein
et al., Nature (1987) (London) 327:70). Once
transformed, the cells can be regenerated by those
skilled in the art.

Of particular relevance are the recently described
methods to transform foreign genes into commercially
important crops, such as rapeseed (De Block et al.,
Plant Physiol. (1989) 91:694-701), sunflower (Everett et
al., Bio/Technology (1987) 5:1201), and soybean
(Christou et al., Proc. Natl. Acad. Sci. USA (1989)
86:7500-7504).

Application to Molecular Breeding

The 1.6 kb insert obtained from the plasmid
pSF2-169K was used as a radiolabelled probe on a
Southern blot containing genomic DNA from soybean
(Glycine max (cultivar Bonus) and Glycine soja
(PI81762)) digested with one of several restriction
enzymes. Different patterns of hybridization
(polymorphisms) were identified in digests performed
with restriction enzymes Hind III and Eco RI. These
polymorphisms were used to map two pSF2-169 loci
relative to other loci on the soybean genome essentially
as described by Helentjaris et al. (Theor. Appl. Genet.
(1986) 72:761-769). One mapped to linkage group 11
between 4404.00 and 1503.00 loci (4.5 cM and 7.1 cM from
4404.00 and 1503.00, respectively) and the other to
linkage group 19 between 4010.00 and 5302.00 loci
(1.9 cM and 2.7 cM from 4010.00 and 5302.00,
respectively) [Rafalski, A. and Tingey, S. (1993) in
Genetic Maps, Ed. O'Brien, S. J.]. The use of
restriction fragment length polymorphism (RFLP) markers
in plant breeding has been well-documented in the art
(Tanksley et al., Bio/Technology (1989) 7:257-264).
Thus, the nucleic acid fragments of the invention can be
used as RFLP markers for traits linked to expression of
fatty acid desaturases. These traits will include
altered levels of unsaturated fatty acids. The nucleic
acid fragment of the invention can also be used to
isolate the fatty acid desaturase gene from variant
(including mutant) plants with altered levels of
unsaturated fatty acids. Sequencing of these genes will
reveal nucleotide differences from the normal gene that
cause the variation. Short oligonucleotides designed
around these differences may also be used in molecular
breeding either as hybridization probes or in DNA-based
diagnostics to follow the variation in fatty acids.
Oligonucleotides based on differences that are linked to
the variation may be used as molecular markers in
breeding these variant oil traits.
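
How a single nucleotide difference gives rise to a restriction fragment length polymorphism can be sketched in a few lines of Python. The example below is a simplified illustration, not the mapping procedure used above: the two "alleles" are invented sequences, the function name is hypothetical, and the digest is modelled as complete cutting of a linear molecule at the standard Eco RI (G^AATTC) and Hind III (A^AGCTT) recognition sites.

```python
import re

# Recognition sites for the two enzymes named in the text.
SITES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT"}
# Position of the cut within the site on the top strand (G^AATTC, A^AGCTT).
CUT_OFFSET = {"EcoRI": 1, "HindIII": 1}


def fragment_lengths(seq: str, enzyme: str) -> list[int]:
    """Fragment lengths from a complete digest of a linear molecule."""
    site = SITES[enzyme]
    # A zero-width lookahead finds every occurrence of the site.
    cuts = [
        m.start() + CUT_OFFSET[enzyme]
        for m in re.finditer(f"(?={site})", seq.upper())
    ]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Two hypothetical alleles; one base change (A -> G) destroys the
# second Eco RI site in allele_b.
allele_a = "CCCC" + "GAATTC" + "TTTTTTTT" + "GAATTC" + "AAAA"
allele_b = "CCCC" + "GAATTC" + "TTTTTTTT" + "GAGTTC" + "AAAA"

print(fragment_lengths(allele_a, "EcoRI"))
print(fragment_lengths(allele_b, "EcoRI"))
```

On a Southern blot, only the fragments overlapping the probe would be detected, so the band pattern seen for each allele is a subset of the lengths computed here.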

EXAMPLES

The present invention is further defined in the
following Examples, in which all parts and percentages
are by weight and degrees are Celsius, unless otherwise
stated. It should be understood that these Examples,
while indicating preferred embodiments of the invention,
are given by way of illustration only. From the above
discussion and these Examples, one skilled in the art
can ascertain the essential characteristics of this
invention, and without departing from the spirit and
scope thereof, can make various changes and
modifications of the invention to adapt it to various
usages and conditions. All publications, including
patents and non-patent literature, referred to in this
specification are expressly incorporated by reference
herein.

EXAMPLE 1

ISOLATION OF GENOMIC DNA FLANKING THE T-DNA SITE OF
INSERTION IN ARABIDOPSIS THALIANA MUTANT LINE 658
Identification of an Arabidopsis thaliana
T-DNA Mutant with High Oleic Acid Content

A population of Arabidopsis thaliana (geographic
race Wassilewskija) transformants containing the
modified T-DNA of Agrobacterium tumefaciens was
generated by seed transformation as described by
Feldmann et al. (Mol. Gen. Genetics (1987) 208:1-9).
In this population the transformants contain DNA
sequences encoding the pBR322 bacterial vector, nopaline
synthase, neomycin phosphotransferase (NPTII, confers
kanamycin resistance), and b-lactamase (confers
ampicillin resistance) within the T-DNA border
sequences. The integration of the T-DNA into different
areas of the chromosomes of individual transformants may
cause a disruption of plant gene function at or near the
site of insertion, and phenotypes associated with this
loss of gene function can be analyzed by screening the
population for the phenotype.

T3 seed was generated from the wild type seed
treated with Agrobacterium tumefaciens by two rounds of
self-fertilization as described by Feldmann et al.
(Science (1989) 243:1351-1354). These progeny were
segregating for the T-DNA insertion, and thus for any
mutation resulting from the insertion. Approximately
10-12 leaves of each of 1700 lines were combined and the
fatty acid content of each of the 1700 pooled samples
was determined by gas chromatography of the fatty acyl
methyl esters essentially as described by Browse et al.
(Anal. Biochem. (1986) 152:141-145) except that 2.5%
H2SO4 in methanol was used as the methylation reagent.
A line designated "658" produced a sample that gave an
altered fatty acid profile compared to those of lines
657 and 659 sampled at the same time (Table 8).

TABLE 8

Fatty Acid      657 Leaf    659 Leaf    658 Leaf
Methyl Ester    Pool        Pool        Pool

16:0            14.4        14.1        13.6
16:1             4.4         4.6         4.5
16:2             2.9         2.2         2.7
16:3            13.9        13.3        13.9
18:0             1.0         1.1         0.9
18:1             2.6         2.5         4.9
18:2            14.0        13.6        12.8
18:3            42.9        46.1        44.4



Analysis of the fatty acid composition of 12
individual T3 seeds of line 658 indicated that the 658
pool was composed of seeds segregating in three classes:
"high", "mid-range" and "low" classes with
approximately 37% (12 seeds), 21% (7 seeds), and 14%
(3 seeds) oleic acid, respectively (Table 9).

TABLE 9

            "High"      "Mid-range"    "Low"
            Class       Class          Class

16:0         8.9         8.7            9.3
16:1c        2.0         1.6            2.6
18:0         4.5         4.3            4.4
18:1        37.0        20.7           14.4
18:2         8.0        24.9           27.7
18:3        10.6        14.3           13.6
20:1        25.5        21.6           20.4



Thus, the high oleic acid mutant phenotype
segregates in an approximately Mendelian ratio. To
determine the number of independently segregating T-DNA
inserts in line 658, 200 T3 seeds were tested for their
ability to germinate and grow in the presence of
kanamycin [Feldmann et al. (1989) Science 243:1351-1354].
In this experiment, only 4 kanamycin-sensitive
individual plants were identified. The segregation
ratio (approximately 50:1) indicated that line 658
harbored three T-DNA inserts. In this and two other
experiments a total of 56 kanamycin-sensitive plants
were identified; 53 of these were analyzed for fatty
acid composition and at least seven of these displayed
oleic acid levels that were higher than would be
expected for wild type seedlings grown under these
conditions.
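
The inference from a roughly 50:1 ratio to three inserts can be checked with a short calculation. The Python sketch below is an illustration under stated assumptions rather than the Applicants' analysis: it assumes each insert is an unlinked, dominant kanamycin-resistance locus that is hemizygous in the selfed parent, so that the expected sensitive fraction for n inserts is (1/4)^n (ratios of 3:1, 15:1, 63:1, and so on). The function names are hypothetical.

```python
def expected_sensitive_fraction(n_loci: int) -> float:
    """Fraction of selfed progeny carrying no insert at any of n loci.

    Assumes unlinked, dominant, hemizygous loci (an assumption, not
    something stated in the text).
    """
    return 0.25 ** n_loci


def chi_square(resistant: int, sensitive: int, n_loci: int) -> float:
    """Pearson chi-square of observed counts against the n-locus model."""
    total = resistant + sensitive
    exp_sensitive = total * expected_sensitive_fraction(n_loci)
    exp_resistant = total - exp_sensitive
    return (
        (resistant - exp_resistant) ** 2 / exp_resistant
        + (sensitive - exp_sensitive) ** 2 / exp_sensitive
    )


def best_fit_loci(resistant: int, sensitive: int, max_loci: int = 5) -> int:
    """Number of loci (1..max_loci) whose model fits the counts best."""
    return min(
        range(1, max_loci + 1),
        key=lambda n: chi_square(resistant, sensitive, n),
    )


# 200 T3 seeds tested, 4 kanamycin-sensitive (196 resistant : 4 sensitive).
for n in range(1, 5):
    print(n, round(chi_square(196, 4, n), 2))
print("best fit:", best_fit_loci(196, 4))
```

Under these assumptions the three-locus model (expected 63:1) fits the observed 196:4 counts better than the one-, two- or four-locus models, which is consistent with the conclusion drawn in the text.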

In order to more rigorously test whether the
mutation resulting in high oleic acid is the result of
T-DNA insertion, Applicants identified a derivative line
that was segregating for the mutant fatty acid phenotype
as well as a single kanamycin resistance locus. For
this, approximately 100 T3 plants were individually
grown to maturity and seeds collected. One sample of
seed from each T3 plant was tested for the ability to
germinate and grow in the presence of kanamycin. In
addition, the fatty acid compositions of ten additional
individual seeds from each line were determined. A T3
plant, designated 658-75, was identified whose progeny
seeds segregated 28 kanamycin-sensitive to 60
kanamycin-resistant and 7 with either low or intermediate
oleic acid to 2 high oleic acid.

A total of approximately 400 T4 progeny seeds of
the derivative line 658-75 were grown and the leaf fatty
acid composition analyzed. A total of 91 plants were
identified as being homozygous for the high oleic acid
trait (18:2/18:1 less than 0.5). The remaining plants
(18:2/18:1 more than 1.2) could not be definitively
assigned to wild type and heterozygous classes on the
basis of leaf fatty acid composition and thus could not
be used to test linkage between the kanamycin marker and
the fatty acid mutation. Eighty-three of the 91
apparently homozygous high oleic acid mutants were tested
for the presence of nopaline, another T-DNA marker, in
leaf extracts (Errampalli et al., The Plant Cell
(1991) 3:149-157) and all 83 plants were positive for the
presence of nopaline. This tight linkage of the mutant
fatty acid phenotype and a T-DNA marker provides
evidence that the high oleic acid trait in mutant 658 is
the result of T-DNA insertion.
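
The ratio-based genotype call used above can be written as a small helper. This is a sketch only: the two cut-offs (0.5 and 1.2) come from the text, while the function name, the "unassigned" category for ratios between the cut-offs, and the sample values other than the Table 8 figures are assumptions added for illustration.

```python
def classify_leaf_sample(pct_18_2: float, pct_18_1: float) -> str:
    """Classify a plant from its leaf 18:2/18:1 ratio.

    Cut-offs follow the text: below 0.5 is scored as homozygous high
    oleic; above 1.2 is wild type or heterozygous (the two cannot be
    told apart from leaf data). Ratios in between are not described in
    the text and are labelled "unassigned" here as an assumption.
    """
    ratio = pct_18_2 / pct_18_1
    if ratio < 0.5:
        return "homozygous high oleic"
    if ratio > 1.2:
        return "wild type or heterozygous"
    return "unassigned"


# Line 657 leaf pool from Table 8: 18:2 = 14.0, 18:1 = 2.6.
print(classify_leaf_sample(14.0, 2.6))
# Hypothetical high oleic leaf sample (values invented for illustration).
print(classify_leaf_sample(4.0, 10.0))
```

Using a ratio rather than the absolute 18:1 percentage makes the call less sensitive to sample-to-sample variation in total fatty acid recovery.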

Plasmid Rescue and Analysis
One-half and one microgram of genomic DNA from the
homozygous mutant plants of the 658-75 line, prepared
from leaf tissue as described [Rogers, S. O. and A. J.
Bendich (1985) Plant Molecular Biology 5:69-76], was
digested with 20 units of either Bam HI or Sal I
restriction enzyme (Bethesda Research Laboratory) in a
50 µL reaction volume according to the manufacturer's
specifications. After digestion the DNA was extracted
with buffer-saturated phenol (Bethesda Research
Laboratory) followed by precipitation in ethanol.
One-half to one microgram of Bam HI or Sal I digested
genomic DNA was resuspended in 200 µL or 400 µL of
ligation buffer containing 50 mM Tris-Cl, pH 8.0, 10 mM
MgCl2, 10 mM dithiothreitol, 1 mM ATP, and 4 units of T4
DNA ligase (Bethesda Research Laboratory). The dilute
DNA concentration of approximately 2.5 µg/mL in the
ligation reaction was chosen to facilitate
circularization, as opposed to intermolecular joining.
The reaction was incubated for 16 h at 16°C. Competent
DH10B cells (Bethesda Research Laboratory) were
transfected with 10 ng of ligated DNA per 100 µL of
competent cells according to the manufacturer's
specifications. Transformants from Sal I or Bam HI
digests were selected on LB plates (10 g Bacto-tryptone,
5 g Bacto-yeast extract, 5 g NaCl, 15 g agar per liter,
pH 7.4) containing 100 µg/mL ampicillin. After overnight
incubation at 37°C the plates were scored for
ampicillin-resistant colonies.
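
The dilution figures in the plasmid rescue protocol can be verified with simple unit arithmetic. The Python sketch below uses only the masses and volumes stated above; the function names are hypothetical, and the final line (the volume of ligation mix that carries 10 ng of DNA) is a derived value, not one given in the text.

```python
def concentration_ug_per_ml(mass_ug: float, volume_ul: float) -> float:
    """DNA concentration in ug/mL from mass (ug) and volume (uL)."""
    return mass_ug * 1000.0 / volume_ul


def volume_ul_for_mass(mass_ng: float, conc_ug_per_ml: float) -> float:
    """Volume (uL) containing mass_ng of DNA; 1 ug/mL equals 1 ng/uL."""
    return mass_ng / conc_ug_per_ml


# 0.5 ug in 200 uL and 1 ug in 400 uL give the same dilute concentration.
low = concentration_ug_per_ml(0.5, 200.0)
high = concentration_ug_per_ml(1.0, 400.0)
print(low, high)

# Derived value: volume of ligation mix holding the 10 ng used per
# 100 uL of competent cells.
print(volume_ul_for_mass(10.0, low))
```

Both resuspension options therefore yield the same approximately 2.5 µg/mL concentration, which is what favors intramolecular circularization over joining of separate fragments.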

A single ampicillin-resistant transformant derived
from Bam HI-digested plant DNA was used to start a
culture in 35 mL LB medium (10 g Bacto-tryptone, 5 g
yeast extract, 5 g NaCl per liter) containing 25 mg/L
ampicillin. The culture was incubated with shaking
overnight at 37°C and the cells were then collected by
centrifugation at 1000 x g for 10 min. Plasmid DNA,
designated p658-1, was isolated from the cells by the
alkaline lysis method of Birnboim et al. [Nucleic Acids
Research (1979) 7:1513-1523], as described in Sambrook
et al. (Molecular Cloning, A Laboratory Manual, 2nd ed.
(1989) Cold Spring Harbor Laboratory Press). Plasmid
p658-1 DNA was digested by restriction enzymes Bam HI,
Eco RI and Sal I (Bethesda Research Laboratory) and
electrophoresed through a 1% agarose gel in 1X TBE buffer
(0.089 M Tris-borate, 0.002 M EDTA). The restriction
pattern indicated the presence in this plasmid of the
expected 14.2 kb T-DNA fragment and a 1.6 kb putative
plant DNA/T-DNA border fragment.

EXAMPLE 2

CLONING OF ARABIDOPSIS THALIANA MICROSOMAL DELTA-12
DESATURASE cDNA USING GENOMIC DNA FLANKING THE
T-DNA SITE OF INSERTION IN ARABIDOPSIS THALIANA
MUTANT LINE 658-75 AS A HYBRIDIZATION PROBE

Two hundred nanograms of the 1.6 kb Eco RI-Bam HI
fragment from plasmid p658-1, following digestion of the
plasmid with Eco RI and Bam HI and purification by
electrophoresis in agarose, was radiolabelled with
alpha[32P]-dCTP using a Random Priming Labeling Kit
(Bethesda Research Laboratory) under conditions
recommended by the manufacturer.

The radiolabeled DNA was used as a probe to screen
an Arabidopsis cDNA library made from RNA isolated from
above ground portions of various growth stages (Elledge
et al., (1991) Proc. Nat. Acad. Sci., 88:1731-1735)
essentially as described in Sambrook et al., (Molecular
Cloning, A Laboratory Manual, 2nd ed. (1989), Cold
Spring Harbor Laboratory Press). For this,
approximately 17,000 plaque-forming units were plated on
seven 90 mm petri plates containing a lawn of LE392
E. coli cells on NZY agar media (5 g NaCl, 2 g MgSO4-7
H2O, 5 g yeast extract, 10 g casein acid hydrolysate,
13 g agar per liter). Replica filters of the phage
plaques were prepared by adsorbing the plaques onto
nitrocellulose filters (BA85, Schleicher and Schuell)
then soaking successively for five min each in 0.5 M
NaOH/1 M NaCl, 0.5 M Tris (pH 7.4)/1.5 M NaCl and 2X SSPE
(0.36 M NaCl, 20 mM NaH2PO4 (pH 7.4), 20 mM EDTA
(pH 7.4)). The filters were then air dried and baked
for 2 h at 80°C. After baking the filters were wetted
in 2X SSPE, and then incubated at 42°C in
prehybridization buffer (50% Formamide, 5X SSPE, 1% SDS,
5X Denhardt's Reagent, and 100 ug/mL denatured salmon
sperm DNA) for 2 h. The filters were removed from the
prehybridization buffer, and then transferred to
hybridization buffer (50% Formamide, 5X SSPE, 1% SDS, 1X
Denhardt's Reagent, and 100 ug/mL denatured salmon sperm
DNA) containing the denatured radiolabeled probe (see
above) and incubated for 40 h at 42°C. The filters were
washed three times in 2X SSPE/0.2% SDS at 42°C (15 min
each) and twice in 0.2X SSPE/0.2% SDS at 55°C (30 min
each), followed by autoradiography on Kodak XAR-5 film
with an intensifying screen at -80°C, overnight.
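
The wash series above steps from 2X SSPE down to 0.2X SSPE, a tenfold reduction in salt that raises stringency. As a rough illustration only, the following sketch scales the NaCl concentration from the 2X SSPE composition quoted in this example (0.36 M NaCl). The function name and the assumption of simple linear dilution are illustrative and are not part of the original protocol.

```python
# NaCl molarity of 2X SSPE as quoted in this example.
NACL_MOLAR_AT_2X = 0.36


def sspe_nacl_molar(x_factor: float) -> float:
    """Return the NaCl molarity of SSPE at the given 'X' strength.

    Assumes simple linear dilution of the 2X composition quoted in the text.
    """
    if x_factor < 0:
        raise ValueError("buffer strength cannot be negative")
    return NACL_MOLAR_AT_2X * x_factor / 2.0


if __name__ == "__main__":
    # Strengths used for hybridization (5X) and for the two wash steps.
    for strength in (5.0, 2.0, 0.2):
        print(f"{strength}X SSPE: {sspe_nacl_molar(strength):.3f} M NaCl")
```

Under this assumption the final 0.2X wash contains one tenth of the NaCl present in the 2X wash.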

Fifteen plaques were identified as positively-hybridizing
on replica filters. Five of these were
subjected to plaque purification essentially as
described in Sambrook et al., (Molecular Cloning, A
Laboratory Manual, 2nd ed. (1989), Cold Spring Harbor
Laboratory Press). The lambda YES-R cDNA clones were
converted to plasmid by propagating the phage in
E. coli BNN-132 cells, which express Cre protein that
excises the cDNA insert as a double-stranded plasmid by
cre-mediated in vivo site-specific recombination at
'lox' sites present in the phage. Ampicillin-resistant
plasmid clones containing cDNA inserts were grown in
liquid culture, and plasmid DNA was prepared using the
alkaline lysis method as previously described. The
sizes of the resulting plasmids were analyzed by
electrophoresis in agarose gels. The agarose gels were
treated with 0.5 M NaOH/1 M NaCl, and 0.5 M
Tris (pH 7.4), 1.5 M NaCl for 15 min each, and the gel
was then dried completely on a gel drier at 65°C. The
gel was hydrated in 2X SSPE and incubated overnight, at
42°C, in hybridization buffer containing the denatured
radiolabeled probe, followed by washing as described
above. After autoradiography, the inserts of four of
the purified cDNA clones were found to have hybridized
to the probe. Plasmid DNA from the hybridizing clones
was purified by equilibration in a CsCl/ethidium bromide
gradient (see above). The four cDNA clones were
sequenced using Sequenase T7 DNA polymerase (US
Biochemical Corp.) and the manufacturer's instructions,
beginning with primers homologous to vector sequences
that flank the cDNA insert. After comparing the partial
sequences of the inserts obtained from the four clones,
it was apparent that they each contained sequences in
common. One cDNA clone, p92103, containing ca. 1.4 kB
cDNA insert, was sequenced. The longest three clones
were subcloned into the plasmid vector pBluescript
(Stratagene). One of these clones, designated pSF2b,
containing ca. 1.2 kB cDNA insert was also sequenced
serially with primers designed from the newly acquired
sequences as the sequencing experiment progressed. The
composite sequence derived from pSF2b and p92103 is
shown in SEQ ID NO:1.

EXAMPLE 3
CLONING OF PLANT FATTY ACID
DESATURASE cDNAs USING THE ARABIDOPSIS THALIANA
MICROSOMAL DELTA-12 DESATURASE cDNA CLONE AS A
HYBRIDIZATION PROBE

An approximately 1.2 kb fragment containing the
Arabidopsis delta-12 desaturase coding sequence of SEQ
ID NO:1 was obtained from plasmid pSF2b. This plasmid
was digested with EcoR I and the 1.2 kb delta-12
desaturase cDNA fragment was purified from the vector
sequence by agarose gel electrophoresis. The fragment
was radiolabelled with 32P as previously described.

Cloning a Brassica napus Seed
cDNA Encoding Microsomal Delta-12 Fatty Acid
Desaturase

The radiolabelled probe was used to screen a
Brassica napus seed cDNA library. In order to construct
the library, Brassica napus seeds were harvested 20-21
days after pollination, placed in liquid nitrogen, and
polysomal RNA was isolated following the procedure of
Kamalay et al., (Cell (1980) 19:935-946). The
polyadenylated mRNA fraction was obtained by affinity
chromatography on oligo-dT cellulose (Aviv et al., Proc.
Natl. Acad. Sci. USA (1972) 69:1408-1411). Four
micrograms of this mRNA were used to construct a seed
cDNA library in lambda phage (Uni-ZAP™ XR vector) using
the protocol described in the ZAP-cDNA™ Synthesis Kit
(1991 Stratagene Catalog, Item #200400). Approximately
600,000 clones were screened for positively hybridizing
plaques using the radiolabelled EcoR I fragment from
pSF2b as a probe essentially as described in Sambrook et
al., (Molecular Cloning: A Laboratory Manual, 2nd ed.
(1989) Cold Spring Harbor Laboratory Press) except that
low stringency hybridization conditions (50 mM Tris, pH
7.6, 6X SSC, 5X Denhardt's, 0.5% SDS, 100 µg denatured
calf thymus DNA and 50°C) were used and
post-hybridization washes were performed twice with 2X SSC,
0.5% SDS at room temperature for 15 min, then twice with
0.2X SSC, 0.5% SDS at room temperature for 15 min, and
then twice with 0.2X SSC, 0.5% SDS at 50°C for 15 min.
Ten positive plaques showing strong hybridization were
picked, plated out, and the screening procedure was
repeated. From the secondary screen nine pure phage
plaques were isolated. Plasmid clones containing the
cDNA inserts were obtained through the use of a helper
phage according to the in vivo excision protocol
provided by Stratagene. Double-stranded DNA was
prepared using the alkaline lysis method as previously
described, and the resulting plasmids were size-analyzed
by electrophoresis in agarose gels. The largest one of
the nine clones, designated pCF2-165D, contained an
approximately 1.5 kb insert which was sequenced as
described above. The sequence of 1394 bases of the cDNA
insert of pCF2-165D is shown in SEQ ID NO:3. Contained
in the insert but not shown in SEQ ID NO:3 are
approximately 40 bases of the extreme 5' end of the 5'
non-translated region and a poly A tail of about 38
bases at the extreme 3' end of the insert.

Cloning of a Soybean Seed
cDNA Encoding Microsomal Delta-12
Fatty Acid Desaturase
A cDNA library was made as follows: Soybean
embryos (ca. 50 mg fresh weight each) were removed from
the pods and frozen in liquid nitrogen. The frozen
embryos were ground to a fine powder in the presence of
liquid nitrogen and then extracted by Polytron
homogenization and fractionated to enrich for total RNA
by the method of Chirgwin et al. (Biochemistry (1979)
18:5294-5299). The nucleic acid fraction was enriched
for poly A+ RNA by passing total RNA through an oligo-dT
cellulose column and eluting the poly A+ RNA with salt as
described by Goodman et al. (Meth. Enzymol. (1979)
68:75-90). cDNA was synthesized from the purified poly
A+ RNA using cDNA Synthesis System (Bethesda Research
Laboratory) and the manufacturer's instructions. The
resultant double-stranded DNA was methylated by Eco RI
DNA methylase (Promega) prior to filling-in its ends
with T4 DNA polymerase (Bethesda Research Laboratory)
and blunt-end ligation to phosphorylated Eco RI linkers
using T4 DNA ligase (Pharmacia). The double-stranded
DNA was digested with Eco RI enzyme, separated from
excess linkers by passage through a gel filtration
column (Sepharose CL-4B), and ligated to lambda
vector (Stratagene) according to manufacturer's
instructions. Ligated DNA was packaged into phage using
the Gigapack packaging extract (Stratagene) according to
manufacturer's instructions. The resultant cDNA library
was amplified as per Stratagene's instructions and
stored at -80°C.

Following the instructions in the Lambda ZAP
Cloning Kit Manual (Stratagene), the cDNA phage library
was used to infect E. coli BB4 cells and approximately
600,000 plaque forming units were plated onto 150 mm
diameter petri plates. Duplicate lifts of the plates
were made onto nitrocellulose filters (Schleicher &
Schuell). The filters were prehybridized in 25 mL of
hybridization buffer consisting of 6X SSPE, 5X
Denhardt's solution, 0.5% SDS, 5% dextran sulfate and
0.1 mg/mL denatured salmon sperm DNA (Sigma Chemical
Co.) at 50°C for 2 h. Radiolabelled probe prepared from
pSF2b as described above was added, and allowed to
hybridize for 18 h at 50°C. The filters were washed
exactly as described above. Autoradiography of the
filters indicated that there were 14 strongly
hybridizing plaques. The 14 plaques were subjected to a
second round of screening as before. Numerous, strongly
hybridizing plaques were observed on 6 of the 14
filters, and one, well-isolated from other phage, was
picked from each of the six plates for further analysis.

Following the Lambda ZAP Cloning Kit Instruction
Manual (Stratagene), sequences of the pBluescript
vector, including the cDNA inserts, from the purified
phages were excised in the presence of a helper phage
and the resultant phagemids were used to infect E. coli
XL-1 Blue cells. DNA from the plasmids was made by the
Promega "Magic Miniprep" according to the manufacturer's
instructions. Restriction analysis indicated that the
plasmids contained inserts ranging in size from 1 kb to
2.5 kb. The alkali-denatured double-stranded DNA from
one of these, designated pSF2-169K, which contained an
insert of 1.6 kb, was sequenced as described above. The
nucleotide sequence of the cDNA insert in plasmid
pSF2-169K is shown in SEQ ID NO:5.

Cloning of a Corn (Zea mays)
cDNA Encoding Seed Microsomal Delta-12
Fatty Acid Desaturase
Corn microsomal delta-12 desaturase cDNA was
isolated using a PCR approach. For this, a cDNA library
was made to poly A+ RNA from developing corn embryos in
Lambda ZAP II vector (Stratagene). 5-10 ul of this
library was used as a template for PCR using 100 pmol
each of two sets of degenerate oligomers NS3 (SEQ ID
NO:13) and equimolar amounts of RB5a/b (that is,
equimolar amounts of SEQ ID NOS:16/17) as sense and
antisense primers, respectively. NS3 and RB5a/b
correspond to stretches of amino acids 101-109 and
318-326, respectively, of SEQ ID NO:2, which are
conserved in most microsomal delta-12 desaturases (SEQ
ID NOS:2, 4, 6, 8). PCR was carried out using the PCR
kit (Perkin-Elmer) using 40 cycles of 94°C, 1 min, 45°C,
1 min, and 55°C, 2 min. Analyses of the PCR products on
an agarose gel showed the presence of a product of the
expected size (720 bp), which was absent in control
reactions containing either the sense or antisense
primers alone. The PCR product fragment was gel
purified and then used as a probe for screening the same
corn cDNA library at 60°C as described above. One
positively-hybridizing plaque was purified and partial
sequence determination of its cDNA showed it to be a
nucleotide sequence encoding microsomal delta-12
desaturase but truncated at the 3' end. The cDNA insert
encoding the partial desaturase was gel isolated and
used to probe the corn cDNA library again. Several
positive plaques were recovered and characterized. DNA
sequence analysis revealed that all of these clones seem
to represent the same sequence with different lengths
of 5' or 3' ends. The clone containing the longest
insert, designated pFad2#1, was sequenced completely.
SEQ ID NO:7 shows the 5' to 3' nucleotide sequence of
1790 base pairs of corn (Zea mays) cDNA which encodes
microsomal delta-12 desaturase in plasmid pFad2#1.
Nucleotides 165 to 167 and nucleotides 1326 to 1328 are,
respectively, the putative initiation codon and the
termination codon of the open reading frame (nucleotides
164 to 1328). SEQ ID NO:8 is the 387 amino acid protein
sequence deduced from the open reading frame
(nucleotides 164 to 1328) in SEQ ID NO:7. The deduced
amino acid sequence of the polypeptide shared overall
identities of 71%, 40%, and 38% to Arabidopsis
microsomal delta-12 desaturase, Arabidopsis microsomal
delta-15 desaturase, and Arabidopsis plastid delta-15
desaturase, respectively. Furthermore, it lacked an
N-terminal amino acid extension that would indicate it
is a plastid enzyme. Based on these considerations, it
is concluded that it encodes a microsomal delta-12
desaturase.
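
The reading-frame coordinates quoted above can be checked with simple arithmetic. The sketch below is illustrative only (the function name and return format are assumptions); it uses 1-based inclusive coordinates and counts the termination codon as encoding no residue. With an initiation codon at 165-167 and a termination codon at 1326-1328, the frame spans 1164 nucleotides, which corresponds to 387 residues plus a stop codon and agrees with SEQ ID NO:8. A span beginning at nucleotide 164 is not a multiple of three.

```python
def orf_summary(start: int, stop_end: int) -> dict:
    """Summarize an open reading frame from 1-based inclusive coordinates.

    start    -- first base of the initiation codon
    stop_end -- last base of the termination codon
    """
    length = stop_end - start + 1
    codons, remainder = divmod(length, 3)
    in_frame = remainder == 0
    return {
        "length_nt": length,
        "in_frame": in_frame,
        # The final codon is the stop codon and encodes no residue.
        "amino_acids": codons - 1 if in_frame else None,
    }


if __name__ == "__main__":
    print(orf_summary(165, 1328))  # span from initiation to termination codon
    print(orf_summary(164, 1328))  # span as quoted for the reading frame
```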

Cloning of a cDNA Encoding a Microsomal Delta-12
Desaturase and of cDNAs Encoding Microsomal Delta-12
Desaturase-Related Enzymes from Castor Bean Seed

Castor microsomal delta-12 desaturase cDNA was
isolated using a RT-PCR approach. Polysomal mRNA was
isolated from castor beans of stages I-II (5-10 DAP) and
also from castor beans of stages IV-V (20-25 DAP).
Ten ng of each mRNA was used for separate RT-PCR
reactions, using the Perkin-Elmer RT-PCR kit with the
reagent concentration as recommended by the kit
protocol. The reverse transcriptase reaction was primed
with random hexamers and the PCR reaction with 100 pmol
each of the degenerate delta-12 desaturase primers NS3
and NS9 (SEQ ID NOS:13 and 14, respectively). The
reverse transcriptase reaction was incubated at 25°C for
10 min, 42°C for 15 min, SS^'C for 5 min and 5°C for
5 min. The PCR reaction was incubated at 95°C for 2 min
followed by 35 cycles of 95°C for 1 min/50°C for 1 min.
A final incubation at SO'^C for 7 min completed the
reaction. A DNA fragment of 720 bp was amplified from
both stage I-II and stage IV-V mRNA. The amplified DNA
fragment from one of the reactions was gel purified and
cloned into a pGEM-T vector using the Promega pGEM-T PCR
cloning kit to create the plasmid pRF2-1C. The 720 bp
insert in pRF2-1C was sequenced, as described above, and
the resulting DNA sequence is shown in SEQ ID NO:9. The
DNA sequence in SEQ ID NO:9 contains an open-reading
frame encoding 219 amino acids (SEQ ID NO:10), which has
81% identity (90% similarity) with amino acids 135 to
353 of the Arabidopsis microsomal delta-12 desaturase
described in SEQ ID NO:2. The cDNA insert in pRF2-1C is
therefore a 673 bp fragment of a full-length cDNA
encoding a castor bean seed microsomal delta-12
desaturase. The full length castor bean seed microsomal
delta-12 desaturase cDNA may be isolated by screening a
castor seed cDNA library, at 60°C, with the labeled
insert of pRF2-1C as described in the example above.
The insert in pRF2-1C may also be used to screen castor
bean libraries at lower temperatures to isolate delta-12
desaturase related sequences, such as the delta-12
hydroxylase.

A cDNA library made to poly A+ mRNA isolated from
developing castor beans (stages IV-V, 20-25 DAP) was
screened as described above. Radiolabeled probes
prepared from pSF2b or pRF2-1C, as described above, were
added, and allowed to hybridize for 18 h at 50°C. The
filters were washed as described above. Autoradiography
of the filters indicated that there were numerous
hybridizing plaques, which appeared either strongly
hybridising or weakly hybridising. Three of the
strongly hybridising plaques (190A-41, 190A-42 and
190A-44) and three of the weakly hybridising plaques
(190B-41, 190b-43 and 197c-42) were plaque purified
using the methods described above. The cDNA insert sizes
of the purified phages were determined by PCR
amplification of the insert using phage as template and
lambda-gt11 oligomers (Clontech lambda-gt11 Amplimers)
for primers. The PCR-amplified inserts of the amplified
phages were subcloned into pBluescript (Pharmacia) which
had been cut with Eco RI and filled in with Klenow
(Sambrook et al. (Molecular Cloning, A Laboratory
Approach, 2nd. ed. (1989) Cold Spring Harbor Laboratory
Press)). The resulting plasmids were called pRF190a-41,
pRF190a-42, pRF190a-44, pRF190b-41, pRF190b-43 and
pRF197c-42. All of the inserts were about 1.1 kb with
the exception of pRF197c-42 which was approx. 1.5 kb.
The inserts in the plasmids were sequenced as described
above. The insert in pRF190b-43 did not contain any
open reading frame and was not identified. The inserts
in pRF190a-41, pRF190a-42, pRF190a-44 and pRF190b-41
were identical. The insert in pRF197c-42 contained all
of the nucleotides of the inserts in pRF190a-41,
pRF190a-42, pRF190a-44 and pRF190b-41 plus an additional
approx. 400 bp. It was deduced therefore that the
insert in pRF197c-42 was a longer version of the inserts
in pRF190a-41, pRF190a-42, pRF190a-44 and pRF190b-41 and
all were derived from the same full-length mRNA. The
complete cDNA sequence of the insert in plasmid
pRF197c-42 is shown in SEQ ID NO:11. The deduced amino
acid sequence of SEQ ID NO:11, shown in SEQ ID NO:12, is
78.5% identical (90% similarity) to the castor
microsomal delta-12 desaturase described above (SEQ ID
NO:10) and 66% identical (80% similarity) to the
Arabidopsis delta-12 desaturase amino acid sequence in
SEQ ID NO:2. These similarities confirm that pRF197c-42
is a castor bean seed cDNA that encodes a microsomal
delta-12 desaturase or a microsomal delta-12
desaturase-related enzyme, such as a delta-12 hydroxylase.
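
Percent identity figures such as those above are obtained by counting matching residues over an alignment. The text does not say which alignment program or scoring convention was used, so the following is only a minimal sketch under stated assumptions: the two sequences are already aligned to equal length, "-" marks a gap, and identity is computed over columns where both sequences have a residue. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where both aligned sequences have a residue.

    Other conventions (for example dividing by the full alignment length)
    give different values; this choice is an assumption for illustration.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy peptides, not taken from the sequence listing.
    print(percent_identity("MGAGGRMP", "MGAGGRTP"))
```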

Specific PCR primers for pRF2-1C and pRF197c-42 were
made. For pRF2-1C the upstream primer was bases 180 to
197 of the cDNA sequence in SEQ ID NO:9. For pRF197c-42
the upstream primer was bases 717 to 743 of the cDNA
sequence in SEQ ID NO:11. A common downstream primer
was made corresponding to the exact complement of the
nucleotides 463 to 478 of the sequence described in SEQ
ID NO:9. Using RT-PCR with random hexamers and the
above primers, and the incubation temperatures described
above, it was observed that mRNA which gave rise to the
cDNA contained in pRF2-1C is present in both Stage I-II
and Stage IV-V castor bean seeds whereas mRNA which gave
rise to the cDNA contained in plasmid pRF197c-42 is
present only in Stage IV-V castor bean seeds, i.e., it
is only expressed in tissue actively synthesizing
ricinoleic acid. Thus it is possible that this cDNA
encodes a delta-12 hydroxylase.

Clones such as pRF2-1C and pRF197c-42, and other
clones from the differential screening, which, based on
their DNA sequence, are less related to castor bean seed
microsomal delta-12 desaturases and are not any of the
known fatty-acid desaturases described above or in
WO 9311245, may be expressed, for example, in soybean
embryos or another suitable plant tissue, or in a
microorganism, such as yeast, which does not normally
contain ricinoleic acid, using suitable expression
vectors and transformation protocols. The presence of
novel ricinoleic acid in the transformed tissue(s)
expressing the castor cDNA would confirm the identity of
the castor cDNA as DNA encoding an oleate
hydroxylase.

EXAMPLE 4

USE OF THE ARABIDOPSIS THALIANA DELTA-12 DESATURASE
GENOMIC CLONE AS A RESTRICTION FRAGMENT LENGTH
POLYMORPHISM (RFLP) MARKER TO MAP THE DELTA-12
DESATURASE LOCUS IN ARABIDOPSIS

The gene encoding Arabidopsis microsomal delta-12
desaturase was used to map the genetic locus encoding
the microsomal delta-12 desaturase of Arabidopsis
thaliana. pSF2b cDNA insert encoding Arabidopsis
microsomal delta-12 desaturase DNA was radiolabeled and
used to screen an Arabidopsis genomic DNA library. DNA
from several pure strongly-hybridizing phages was
isolated. Southern blot analysis of the DNA from
different phages using radiolabeled pSF2b cDNA insert as
the probe identified a 6 kb Hind III insert fragment to
contain the coding region of the gene. This fragment
was subcloned in pBluescript vector to result in plasmid
pAGF2-6 and used for partial sequence determination.
This sequence (SEQ ID NO:15) confirmed that it is the
microsomal delta-12 desaturase gene. DNA from two
phages was isolated and labelled with 32P using a random
priming kit from Pharmacia under conditions recommended
by the manufacturer. The radioactive DNA was used to
probe a Southern blot containing genomic DNA from
Arabidopsis thaliana (ecotype Wassileskija and marker
line W100, ecotype Landsberg background) digested with
one of several restriction endonucleases. Following
hybridization and washes under standard conditions
(Sambrook et al., Molecular Cloning: A Laboratory
Manual, 2nd ed. (1989) Cold Spring Harbor Laboratory
Press), autoradiograms were obtained. A different
pattern of hybridization (polymorphism) was identified
in Hind III-digested genomic DNAs using one of the phage
DNAs. This polymorphism was located to a 7 kB Hind III
fragment in the phage DNA that revealed the
polymorphism. The 7 kb fragment was subcloned in
pBluescript vector to result in plasmid pAGF2-7.
Plasmid pAGF2-7 was restricted with Hind III enzyme and
used as a radiolabelled probe to map the polymorphism
essentially as described by Helentjaris et al., (Theor.
Appl. Genet. (1986) 72:761-769). The radiolabelled DNA
fragment was applied as described above to Southern
blots of Hind III-digested genomic DNA isolated from 117
recombinant inbred progeny (derived from single-seed
descent lines to the F6 generation) resulting from a
cross between Arabidopsis thaliana marker line W100 and
ecotype Wassileskija (Burr et al., Genetics (1988)
118:519-526). The bands on the autoradiograms were
interpreted as resulting from inheritance of either
paternal (ecotype Wassileskija) or maternal (marker line
W100) DNA or both (a heterozygote). The resulting
segregation data were subjected to genetic analysis
using the computer program Mapmaker (Lander et al.,
Genomics (1987) 1:174-181). In conjunction with
previously obtained segregation data for 63 anonymous
RFLP markers and 9 morphological markers in Arabidopsis
thaliana (Chang et al., Proc. Natl. Acad. Sci. USA
(1988) 85:6856-6860; Nam et al., Plant Cell (1989)
1:699-705), a single genetic locus was positioned
corresponding to the microsomal delta-12 desaturase
gene. The location of the microsomal delta-12
desaturase gene was thus determined to be 13.6 cM
proximal to locus c3838, 9.2 cM distal to locus lAt228,
and 4.9 cM proximal to FadD locus on chromosome 3
[Koornneef, M. et al. (1993) in Genetic Maps, Ed.
O'Brien, S. J.; Yadav et al. (1993) Plant Physiology
103:467-476].
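
Map distances in centimorgans, such as those reported above, are derived from recombination fractions through a mapping function. The text does not state which mapping function or settings were used with Mapmaker, so the sketch below simply shows the two standard choices (Haldane and Kosambi) as a general illustration. The function names are hypothetical, and the input is assumed to be a per-meiosis recombination fraction.

```python
import math


def _check_fraction(r: float) -> None:
    # Recombination fractions lie between 0 and 0.5 (exclusive upper bound here).
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")


def haldane_cm(r: float) -> float:
    """Haldane map distance in cM (assumes no crossover interference)."""
    _check_fraction(r)
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r: float) -> float:
    """Kosambi map distance in cM (allows for crossover interference)."""
    _check_fraction(r)
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


if __name__ == "__main__":
    for r in (0.01, 0.05, 0.10, 0.20):
        print(f"r={r:.2f}  Haldane={haldane_cm(r):.2f} cM  "
              f"Kosambi={kosambi_cm(r):.2f} cM")
```

For small recombination fractions both functions give values close to 100 times the fraction, so the choice matters mainly for loosely linked markers.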

EXAMPLE 5

USE OF SOYBEAN MICROSOMAL DELTA-12 DESATURASE cDNA
SEQUENCE AS A RESTRICTION FRAGMENT
LENGTH POLYMORPHISM (RFLP) MARKER

The 1.6 kb insert obtained from the plasmid
pSF2-169K as previously described was radiolabelled with
32P using a Random Priming Kit from Bethesda Research
Laboratories under conditions recommended by the
manufacturer. The resulting radioactive probe was used
to probe a Southern blot (Sambrook et al., Molecular
Cloning: A Laboratory Manual, 2nd Ed. (1989) Cold
Spring Harbor Laboratory Press) containing genomic DNA
from soybean (Glycine max (cultivar Bonus) and Glycine
soja (PI81762)) digested with one of several restriction
enzymes. After hybridization and washes under low
stringency conditions (50 mM Tris, pH 7.5, 6X SSPE, 10%
dextran sulfate, 1% SDS at 56°C for the hybridization
and initial washes, changing to 2X SSPE and 0.1% SDS for
the final wash), autoradiograms were obtained, and
different patterns of hybridization (polymorphisms) were
identified in digests performed with restriction enzymes
Hind III and Eco RI. These polymorphisms were used to
map two pSF2-169K loci relative to other loci on the
soybean genome essentially as described by Helentjaris
et al., (Theor. Appl. Genet. (1986) 72:761-769). The
map positions of the polymorphisms were determined to be
in linkage group 11 between 4404.00 and 1503.00 loci
(4.5 cM and 7.1 cM from 4404.00 and 1503.00,
respectively) and linkage group 19 between 4010.00 and
5302.00 loci (1.9 cM and 2.7 cM from 4010.00 and
5302.00, respectively) [Rafalski, A. and Tingey, S.
(1993) in Genetic Maps, Ed. O'Brien, S. J.].

EXAMPLE 6

EXPRESSION OF MICROSOMAL DELTA-12 DESATURASE IN SOYBEANS
Construction of Vectors for Transformation of
Glycine max for Reduced Expression of
Microsomal Delta-12 Desaturases in
Developing Soybean Seeds
Plasmids containing the antisense G. max microsomal
delta-12 desaturase cDNA sequence under control of the
soybean Kunitz Trypsin Inhibitor 3 (KTi3) promoter
(Jofuku and Goldberg, Plant Cell (1989) 1:1079-1093),
the Phaseolus vulgaris 7S seed storage protein
(phaseolin) promoter (Sengupta-Gopalan et al., Proc.
Natl. Acad. Sci. USA (1985) 82:3320-3324; Hoffman et
al., Plant Mol. Biol. (1988) 11:717-729) and soybean
beta-conglycinin promoter (Beachy et al., EMBO J. (1985)
4:3047-3053), were constructed. The construction of
vectors expressing the soybean delta-12 desaturase
antisense cDNA under the control of these promoters was
facilitated by the use of the following plasmids:
pML70, pCW108 and pCW109A.

The pML70 vector contains the KTi3 promoter and the
KTi3 3' untranslated region and was derived from the
commercially available vector pTZ18R (Pharmacia) via the
intermediate plasmids pML51, pML55, pML64 and pML65. A
2.4 kb Bst BI/Eco RI fragment of the complete soybean
KTi3 gene (Jofuku and Goldberg (1989) Plant Cell
1:1079-1093), which contains all 2039 nucleotides of the
5' untranslated region and 390 bases of the coding
sequence of the KTi3 gene ending at the Eco RI site
corresponding to bases 755 to 761 of the sequence
described in Jofuku et al. (1989) Plant Cell 1:427-435,
was ligated into the Acc I/Eco RI sites of pTZ18R to
create the plasmid pML51. The plasmid pML51 was cut
with Nco I, filled in using Klenow, and religated, to
destroy an Nco I site in the middle of the 5'
untranslated region of the KTi3 insert, resulting in the
plasmid pML55. The plasmid pML55 was partially digested
with Xmn I/Eco RI to release a 0.42 kb fragment,
corresponding to bases 732 to 755 of the above cited
sequence, which was discarded. A synthetic Xmn I/Eco RI
linker containing an Nco I site was constructed by
making a dimer of complementary synthetic
oligonucleotides consisting of the coding sequence for an Xmn
I site (5'-TCTTCC-3') and an Nco I site (5'-CCATGGG-3')
followed directly by part of an Eco RI site
(5'-GAAGG-3'). The Xmn I and Nco I/Eco RI sites were
linked by a short intervening sequence
(5'-ATAGCCCCCCAA-3'). This synthetic linker was ligated
into the Xmn I/Eco RI sites of the 4.94 kb fragment to
create the plasmid pML64. The 3' untranslated region of
the KTi3 gene was amplified from the sequence described
in Jofuku et al. (Ibid.) by standard PCR protocols
(Perkin Elmer Cetus, GeneAmp PCR kit) using the primers
ML51 and ML52. Primer ML51 contained the 20 nucleotides

corresponding to bases 1072 to 1091 of the above cited
sequence with the addition of nucleotides corresponding
to Eco RV (5'-GATATC-3'), Nco I (5'-CCATGG-3'), Xba I
(5'-TCTAGA-3'), Sma I (5'-CCCGGG-3') and Kpn I
(5'-GGTACC-3') sites at the 5' end of the primer.
Primer ML52 contained the exact complement of the
nucleotides corresponding to bases 1242 to 1259 of the
above cited sequence with the addition of nucleotides
corresponding to Sma I (5'-CCCGGG-3'), Eco RI
(5'-GAATTC-3'), Bam HI (5'-GGATCC-3') and Sal I
(5'-GTCGAC-3') sites at the 5' end of the primer. The
PCR-amplified 3' end of the KTi3 gene was ligated into
the Nco I/Eco RI sites of pML64 to create the plasmid
pML65. A synthetic multiple cloning site linker was
constructed by making a dimer of complementary synthetic
oligonucleotides consisting of the coding sequence for
Pst I (5'-CTGCA-3'), Sal I (5'-GTCGAC-3'), Bam HI
(5'-GGATCC-3') and Pst I (5'-CTGCA-3') sites. The
linker was ligated into the Pst I site (directly 5' to
the KTi3 promoter region) of pML65 to create the plasmid
pML70.
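
The composition of the multiple cloning site linker can be illustrated by joining the quoted segments and locating the recognition sequences in the result. This is a sketch under stated assumptions: the segments are joined in the order listed with no spacer bases, only the top strand is modeled, and the helper names are hypothetical. Under these assumptions the leading CTGCA segment forms a complete Pst I site (CTGCAG) with the first base of the Sal I site, while the trailing CTGCA segment would need a following G, presumably supplied by the vector at the Pst I junction, to form a second one.

```python
# Recognition sequences of the three enzymes named for this linker.
SITES = {"Pst I": "CTGCAG", "Sal I": "GTCGAC", "Bam HI": "GGATCC"}

# Segments as quoted in the text, in the order listed.
LINKER_PARTS = ["CTGCA", "GTCGAC", "GGATCC", "CTGCA"]


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq: str, sites: dict) -> dict:
    """Map each enzyme name to the 0-based start positions of its site."""
    return {
        name: [i for i in range(len(seq) - len(site) + 1)
               if seq.startswith(site, i)]
        for name, site in sites.items()
    }


linker = "".join(LINKER_PARTS)

if __name__ == "__main__":
    print("top strand:   ", linker)
    print("bottom strand:", reverse_complement(linker))
    print(find_sites(linker, SITES))
```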

The 1.46 kb Sma I/Kpn I fragment from pSF2-169K
(soybean delta-12 desaturase cDNA described above) was
ligated into the corresponding sites in pML70 resulting
in the plasmid pBS10. The desaturase cDNA fragment was
in the reverse (antisense) orientation with respect to
the KTi3 promoter in pBS10. The plasmid pBS10 was
digested with Bam HI and a 3.47 kb fragment,
representing the KTi3 promoter/antisense desaturase
cDNA/KTi3-3' end transcriptional unit was isolated by
agarose gel electrophoresis. The vector pML18 consists
of the non-tissue specific and constitutive cauliflower
mosaic virus (35S) promoter (Odell et al., Nature (1985)
313:810-812; Hull et al., Virology (1987) 86:482-493),
driving expression of the neomycin phosphotransferase
gene described in (Beck et al. (1982) Gene 19:327-336)
followed by the 3' end of the nopaline synthase gene
including nucleotides 848 to 1550 described by (Depicker
et al. (1982) J. Appl. Genet. 1:561-574). This
transcriptional unit was inserted into the commercial
cloning vector pGEM9Z (Gibco-BRL) and is flanked at the
5' end of the 35S promoter by the restriction sites
Sal I, Xba I, Bam HI and Sma I in that order. An
additional Sal I site is present at the 3' end of the
NOS 3' sequence and the Xba I, Bam HI and Sal I sites
are unique. The 3.47 kb transcriptional unit released
from pBS10 was ligated into the Bam HI site of the
vector pML18. When the resulting plasmids were double
digested with Sma I and Kpn I, plasmids containing
inserts in the desired orientation yielded 3 fragments
of 5.74, 2.69 and 1.46 kb. A plasmid with the
transcriptional unit in the correct orientation was
selected and was designated pBS13.
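
A quick consistency check on a diagnostic digest is that the fragment sizes add up to the size of the whole plasmid. The sketch below sums the three Sma I/Kpn I fragments reported for pBS13 and subtracts the 3.47 kb transcriptional unit to estimate the size of the pML18 backbone. The backbone figure is an inference from these numbers and is not stated in the text; the function and variable names are hypothetical.

```python
def total_kb(fragments_kb) -> float:
    """Sum fragment sizes (kb), rounded to the two decimals used in the text."""
    return round(sum(fragments_kb), 2)


# Sma I/Kpn I double-digest fragments reported for the desired orientation.
PBS13_FRAGMENTS_KB = [5.74, 2.69, 1.46]
# Bam HI transcriptional unit released from pBS10.
INSERT_KB = 3.47

pbs13_kb = total_kb(PBS13_FRAGMENTS_KB)
# Inferred size of the pML18 backbone (not given in the text).
backbone_kb = round(pbs13_kb - INSERT_KB, 2)

if __name__ == "__main__":
    print(f"pBS13 total: {pbs13_kb} kb")
    print(f"inferred pML18 backbone: {backbone_kb} kb")
```

The 1.46 kb fragment matches the size of the Sma I/Kpn I desaturase cDNA fragment cloned into pML70, which is consistent with both sites being retained around the cDNA in pBS13.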

The pCW108 vector contains the bean phaseolin
promoter and 3' untranslated region and was derived from
the commercially available pUC18 plasmid (Gibco-BRL) via
plasmids AS3 and pCW104. Plasmid AS3 contains 495 base
pairs of the bean (Phaseolus vulgaris) phaseolin (7S
seed storage protein) promoter starting with
5'-TGGTCTTTTGGT-3' followed by the entire 1175 base
pairs of the 3' untranslated region of the same gene
(see sequence descriptions in Doyle et al., (1986)
J. Biol. Chem. 261:9228-9238 and Slightom et al., (1983)
Proc. Natl. Acad. Sci. USA, 80:1897-1901. Further
sequence description may be found in WO 9113993) cloned
into the Hind III site of pUC18. The additional cloning
sites of the pUC18 multiple cloning region (Eco RI,
Sph I, Pst I and Sal I) were removed by digesting with
Eco RI and Sal I, filling in the ends with Klenow and
religating to yield the plasmid pCW104. A new multiple
cloning site was created between the 495 bp of the 5'
phaseolin and the 1175 bp of the 3' phaseolin by
inserting a dimer of complementary synthetic
oligonucleotides consisting of the coding sequence for a
Nco I site (5'-CCATGG-3') followed by three filler bases
(5'-TAG-3'), the coding sequence for a Sma I site
(5'-CCCGGG-3'), the last three bases of a Kpn I site
(5'-TAC-3'), a cytosine and the coding sequence for an
Xba I site (5'-TCTAGA-3') to create the plasmid pCW108.
This plasmid contains unique Nco I, Sma 1^ Kpn I and 

10 Xba I sites directly behind the phaseolin promoter. The 
1.4 kb Eco RV/Sma I fragment from pSF2-169K was ligated 
into the Sma I site of the commercially available 
phagemid pBC SK+ (Stratagene) . A phagemid with the cDNA 
in the desired orientation was selected by digesting 

15 with Pfl Ml/Xho I to yield fragments of approx* 1 kb and 
4 kb and designated pMl-SF2 . The 1.4 kb Xmn I/Xba I 
fragment from pMl-SF2 was inserted into the Sma I/Xba I 
sites of pCWlOB to yield the plasmid pBSll^ which has 
the soybean delta-12 desaturase cDNA in the reverse 

20 (3' -5*) orientation behind the phaseolin promoter. The 
plasmid pBSll was digested with Bam HI and a 3.07 kb 
fragment, representing the phaseolin promoter/antisense 
desaturase cDNA/phaseolin 3" end transcriptional unit 
was isolated by agarose gel electrophoresis arid ligated 

25 , into the Hind III site of pMLlS (described above) . When 
the resulting plasmids were digested with Xba I, 
plasmids containing inserts in the desired orientation 
yielded 2 fragments of 8.01 and 1.18 kb. A plasmid with 
the transcriptional unit in the correct orientation was 

30 selected and was desiigrnated pBS14. 
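
The synthetic multiple cloning site of pCW108 described above
can be assembled from its stated parts and checked for the four
sites it is said to provide. The sketch below is illustrative
only: it joins the top-strand segments in the order given in the
text and searches for the standard recognition sequences
(Kpn I = GGTACC is supplied here as the usual recognition
sequence; the text itself quotes only its last three bases).

```python
# Top-strand segments, in the order listed in the text.
PARTS = [
    ("Nco I site", "CCATGG"),
    ("filler bases", "TAG"),
    ("Sma I site", "CCCGGG"),
    ("last bases of Kpn I site", "TAC"),
    ("cytosine", "C"),
    ("Xba I site", "TCTAGA"),
]

# Recognition sequences searched for in the assembled linker.
SITES = {
    "Nco I": "CCATGG",
    "Sma I": "CCCGGG",
    "Kpn I": "GGTACC",
    "Xba I": "TCTAGA",
}

polylinker = "".join(seq for _, seq in PARTS)
positions = {name: polylinker.find(site) for name, site in SITES.items()}
counts = {name: polylinker.count(site) for name, site in SITES.items()}

print(polylinker)
for name in SITES:
    print(f"{name}: index {positions[name]}, occurrences {counts[name]}")
```

The Kpn I site is not inserted as a separate block: it is formed
by the final GG of the Sma I site, the TAC segment and the single
cytosine, which is why only three Kpn I bases appear in the
description.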

The vector pCW109A contains the soybean
β-conglycinin promoter sequence and the phaseolin 3'
untranslated region and is a modified version of vector
pCW109 which was derived from the commercially available
plasmid pUC18 (Gibco-BRL). The vector pCW109 was made
by inserting into the Hind III site of the cloning
vector pUC18 a 555 bp 5' non-coding region (containing
the promoter region) of the β-conglycinin gene followed
by the multiple cloning sequence containing the
restriction endonuclease sites for Nco I, Sma I, Kpn I
and Xba I, as described for pCW108 above, then 1174 bp
of the common bean phaseolin 3' untranslated region into
the Hind III site (described above). The β-conglycinin
promoter region used is an allele of the published
β-conglycinin gene (Doyle et al., J. Biol. Chem. (1986)
261:9228-9238) due to differences at 27 nucleotide
positions. Further sequence description of this gene
may be found in Slightom (WO 9113993). To facilitate
use in antisense constructions, the Nco I site and
potential translation start site in the plasmid pCW109
was destroyed by digestion with Nco I, mung bean
exonuclease digestion and re-ligation of the blunt site
to give the modified plasmid pCW109A. The plasmid
pCW109A was digested with Hind III and the resulting
1.84 kb fragment, which contained the β-conglycinin/
antisense delta-12 desaturase cDNA/phaseolin 3'
untranslated region, was gel isolated. The plasmid
pML18 (described above) was digested with Xba I, filled
in using Klenow and religated, in order to remove the
Xba I site. The resulting plasmid was designated pBS16.
The 1.84 kb fragment of plasmid pCW109A (described
above) was ligated into the Hind III site of pBS16. A
plasmid containing the insert in the desired orientation
yielded a 3.53 kb and 4.41 kb fragment when digested
with Kpn I and this plasmid was designated pCST2. The
Xmn I/Xba I fragment of pML1-SF2 (described above) was
ligated into the Sma I/Xba I sites of pCST2 to yield the
vector pST11.
Transformation Of Somatic Soybean Embryo Cultures
and Regeneration Of Soybean Plants

Soybean embryogenic suspension cultures were
maintained in 35 mL liquid media (SB55 or SBP6) on a
rotary shaker, 150 rpm, at 28°C with mixed fluorescent
and incandescent lights on a 16:8 h day/night schedule.
Cultures were subcultured every four weeks by
inoculating approximately 35 mg of tissue into 35 mL of
liquid medium.

Soybean embryogenic suspension cultures were
transformed with pCS3FdST1R by the method of particle
gun bombardment (see Klein et al. (1987) Nature (London)
327:70). A DuPont Biolistic PDS1000/HE instrument
(helium retrofit) was used for these transformations.

To 50 µL of a 60 mg/mL 1 µm gold particle
suspension was added (in order): 5 µL DNA (1 µg/µL),
20 µL spermidine (0.1 M), and 50 µL CaCl2 (2.5 M). The
particle preparation was agitated for 3 min, spun in a
microfuge for 10 sec and the supernatant removed. The
DNA-coated particles were then washed once in 400 µL 70%
ethanol and resuspended in 40 µL of anhydrous ethanol.
The DNA/particle suspension was sonicated three times
for 1 sec each. Five µL of the DNA-coated gold
particles were then loaded on each macro carrier disk.
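
The amounts loaded per macro carrier disk follow from the
volumes above by simple proportion. The sketch below is a rough
estimate only. It assumes the starting suspension is 50 µL of
60 mg/mL gold (reading the garbled unit in the source as
microlitres), that all of the DNA binds to the particles, and
that no gold or DNA is lost in the wash; none of these
assumptions is stated in the text.

```python
# Quantities from the particle preparation paragraph.
GOLD_VOLUME_UL = 50.0        # assumed reading of the garbled unit
GOLD_CONC_MG_PER_ML = 60.0
DNA_VOLUME_UL = 5.0
DNA_CONC_UG_PER_UL = 1.0
FINAL_VOLUME_UL = 40.0       # resuspension volume in anhydrous ethanol
LOAD_PER_DISK_UL = 5.0

# Totals in the tube (assumes no losses during coating and washing).
gold_mg = GOLD_VOLUME_UL / 1000.0 * GOLD_CONC_MG_PER_ML
dna_ug = DNA_VOLUME_UL * DNA_CONC_UG_PER_UL

# Each disk receives the same fraction of the resuspended preparation.
fraction_per_disk = LOAD_PER_DISK_UL / FINAL_VOLUME_UL
disks_per_prep = int(FINAL_VOLUME_UL // LOAD_PER_DISK_UL)
gold_per_disk_mg = gold_mg * fraction_per_disk
dna_per_disk_ug = dna_ug * fraction_per_disk

print(f"gold in tube: {gold_mg:.2f} mg, DNA in tube: {dna_ug:.2f} ug")
print(f"per disk: {gold_per_disk_mg:.3f} mg gold, {dna_per_disk_ug:.3f} ug DNA")
print(f"disks per preparation: {disks_per_prep}")
```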

Approximately 300-400 mg of a four week old
suspension culture was placed in an empty 60x15 mm petri
dish and the residual liquid removed from the tissue
with a pipette. For each transformation experiment,
approximately 5-10 plates of tissue were normally
bombarded. Membrane rupture pressure was set at
1000 psi and the chamber was evacuated to a vacuum of
28 inches of mercury. The tissue was placed
approximately 3.5 inches away from the retaining screen
and bombarded three times. Following bombardment, the
tissue was placed back into liquid and cultured as
described above.

Eleven days post bombardment, the liquid media was
exchanged with fresh SB55 containing 50 mg/mL
hygromycin. The selective media was refreshed weekly.
Seven weeks post bombardment, green, transformed tissue
was observed growing from untransformed, necrotic
embryogenic clusters. Isolated green tissue was removed
and inoculated into individual flasks to generate new,
clonally propagated, transformed embryogenic suspension
cultures. Thus each new line was treated as an independent
transformation event. These suspensions can then be
maintained as suspensions of embryos clustered in an
immature developmental stage through subculture or
regenerated into whole plants by maturation and
germination of individual somatic embryos.

Transformed embryogenic clusters were removed from
liquid culture and placed on a solid agar media (SB103)
containing no hormones or antibiotics. Embryos were
cultured for eight weeks at 26°C with mixed fluorescent
and incandescent lights on a 16:8 h day/night schedule.
During this period, individual embryos were removed from
the clusters and analyzed at various stages of embryo
development. After eight weeks somatic embryos become
suitable for germination. For germination, eight week
old embryos were removed from the maturation medium and
dried in empty petri dishes for 1 to 5 days. The dried
embryos were then planted in SB71-1 medium where they
were allowed to germinate under the same lighting and
germination conditions described above. Germinated
embryos were transferred to sterile soil and grown to
maturity for seed collection.
TABLE 10

Media:

SB55 and SBP6 Stock Solutions (g/L)

MS Sulfate 100X Stock
    MgSO4 7H2O       37.0
    MnSO4 H2O        1.69
    ZnSO4 7H2O       0.86
    CuSO4 5H2O       0.0025

MS Halides 100X Stock
    CaCl2 2H2O       44.0
    KI               0.083
    CoCl2 6H2O       0.00125
    KH2PO4           17.0
    H3BO3            0.62
    Na2MoO4 2H2O     0.025

MS FeEDTA 100X Stock
    Na2EDTA          3.724
    FeSO4 7H2O       2.784

B5 Vitamin Stock
    10 g m-inositol
    100 mg nicotinic acid
    100 mg pyridoxine HCl
    1 g thiamine

SB55 (per Liter)
    10 mL each MS stocks
    1 mL B5 Vitamin stock
    0.8 g NH4NO3
    3.033 g KNO3
    1 mL 2,4-D (10 mg/mL stock)
    60 g sucrose
    0.667 g asparagine
    pH 5.7

For SBP6, substitute 0.5 mL 2,4-D

SB103 (per Liter)
    MS Salts
    6% maltose
    750 mg MgCl2
    0.2% Gelrite
    pH 5.7

SB71-1 (per Liter)
    B5 salts
    1 mL B5 vitamin stock
    3% sucrose
    750 mg MgCl2
    0.2% Gelrite
    pH 5.7
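
The working concentrations in SB55 follow from the stock
recipes by dilution: 10 mL of a 100X stock per liter gives the
1X level, and 1 mL of a 10 mg/mL 2,4-D stock per liter gives
10 mg/L (5 mg/L for SBP6). The Python sketch below does that
arithmetic for the stock values listed above; it is a
convenience calculation, not part of the original recipe, and
the function and variable names are illustrative.

```python
# Stock concentrations in g/L, as listed for the 100X MS stocks.
STOCKS_G_PER_L = {
    "MgSO4 7H2O": 37.0,
    "MnSO4 H2O": 1.69,
    "ZnSO4 7H2O": 0.86,
    "CuSO4 5H2O": 0.0025,
    "CaCl2 2H2O": 44.0,
    "KI": 0.083,
    "CoCl2 6H2O": 0.00125,
    "KH2PO4": 17.0,
    "H3BO3": 0.62,
    "Na2MoO4 2H2O": 0.025,
    "Na2EDTA": 3.724,
    "FeSO4 7H2O": 2.784,
}
STOCK_ML_PER_LITER = 10.0  # "10 mL each MS stocks" per liter of SB55


def final_mg_per_l(stock_g_per_l, stock_ml, medium_l=1.0):
    """Final concentration in mg/L (1 g/L of stock equals 1 mg per mL)."""
    return stock_g_per_l * stock_ml / medium_l


final = {
    salt: final_mg_per_l(conc, STOCK_ML_PER_LITER)
    for salt, conc in STOCKS_G_PER_L.items()
}

# 2,4-D: 10 mg/mL stock (i.e. 10 g/L); 1 mL per liter for SB55, 0.5 mL for SBP6.
sb55_24d_mg_per_l = final_mg_per_l(10.0, 1.0)
sbp6_24d_mg_per_l = final_mg_per_l(10.0, 0.5)

for salt, mg in final.items():
    print(f"{salt:14s} {mg:10.4f} mg/L")
print(f"2,4-D in SB55: {sb55_24d_mg_per_l} mg/L, in SBP6: {sbp6_24d_mg_per_l} mg/L")
```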
Analysis Of Transgenic Glycine Max Embryos and
Seeds Containing An Antisense Delta-15 Desaturase:
Demonstration That The Phenotype Of Transgenic Soybean
Somatic Embryos Is Predictive Of The Phenotype Of Seeds
Derived From Plants Regenerated From Those Embryos

While in the globular embryo state in liquid
culture as described above, somatic soybean embryos
contain very low amounts of triacylglycerol or storage
proteins typical of maturing, zygotic soybean embryos.
At this developmental stage, the ratio of total
triacylglyceride to total polar lipid (phospholipids and
glycolipid) is about 1:4, as is typical of zygotic
soybean embryos at the developmental stage from which
the somatic embryo culture was initiated. At the
globular stage as well, the mRNAs for the prominent seed
proteins (alpha' subunit of beta-conglycinin, Kunitz
Trypsin Inhibitor 3 and Soybean Seed Lectin) are
essentially absent. Upon transfer to hormone free media
to allow differentiation to the maturing somatic embryo
state as described above, triacylglycerol becomes the
most abundant lipid class. As well, mRNAs for alpha'
subunit of beta-conglycinin, Kunitz Trypsin Inhibitor 3
and Soybean Seed Lectin become very abundant messages in
the total mRNA population. In these respects the
somatic soybean embryo system behaves very similarly to
maturing zygotic soybean embryos in vivo, and is
therefore a good and rapid model system for analyzing
the phenotypic effects of modifying the expression of
genes in the fatty acid biosynthesis pathway.
Furthermore, the model system is predictive of the fatty
acid composition of seeds from plants derived from
transgenic embryos. Liquid culture globular embryos
transformed with a vector containing a soybean
microsomal delta-15 desaturase, in a reverse orientation
and under the control of soybean conglycinin promoter
(pCS3FdST1R), gave rise to mature embryos with a
reduced 18:3 content (WO 9311245). A number of embryos
from line A2872 (control tissue transformed with pCST)
and from lines 299/1/3, 299/15/1, 303/7/1, 306/3/1,
306/4/3, 306/4/5 (line A2872 transformed with plasmid
pCS3FdST1R) were analyzed for fatty acid content. Fatty
acid analysis was performed as described in WO 9311245
using single embryos as the tissue source. Mature,
somatic embryos from each of these lines were also
regenerated into soybean plants by transfer to
regeneration medium as described above. A number of
seeds taken from plants regenerated from these embryo
lines were analyzed for fatty acid content. The
relative fatty-acid composition of embryos taken from
tissue transformed with pCS3FdST1R was compared with
relative fatty-acid composition of seeds taken from
plants derived from embryos transformed with pCS3FdST1R.
Also, relative fatty acid compositions of embryos and
seeds transformed with pCS3FdST1R were compared with
control tissue, transformed with pCST. In all cases
where a reduced 18:3 content was seen in a transgenic
embryo line, compared with the control, a reduced 18:3
content was also observed in segregating seeds of plants
derived from that line, when compared with the control
seed (Table 11).

TABLE 11

Antisense Delta-15 Desaturase:
Relative 18:3 Content Of Embryos And Seeds Of Control
(A2872) And Transgenic (299-, 306-) Soybean Lines

Soybean      Embryo       Embryo         Seed         Seed
Line         av. %18:3    lowest %18:3   av. %18:3*   lowest %18:3

A2872        12.1 (2.6)   8.5            8.9 (0.8)    8.0
(control)
299/1/3       5.6 (1.2)   4.5            4.3 (1.6)    2.5
299/15/1      8.9 (2.2)   5.2            2.5 (1.8)    1.4
303/7/1       7.3 (1.1)   5.9            4.9 (1.9)    2.8
306/3/1       7.0 (1.9)   5.3            2.4 (1.7)    1.3
306/4/3       8.5 (1.9)   6.4            4.5 (2.2)    2.7
306/4/5       7.6 (1.6)   5.6            4.6 (1.6)    2.7

* Seeds which were segregating with wild-type phenotype and
without a copy of the transgene are not included in these
averages. The number in brackets is S.D., n=10.

Thus the Applicants conclude that an altered
polyunsaturated fatty acid phenotype observed in a
transgenic, mature somatic embryo line is predictive of
an altered fatty acid composition of seeds of plants
derived from that line.
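
The comparison underlying this conclusion can be restated
numerically. The Python sketch below takes the average 18:3
values for embryos and seeds from Table 11 and computes, for
each transgenic line, the percent reduction relative to the
A2872 control. It is an illustrative calculation on the
published averages, not a statistical test, and the helper
name is hypothetical.

```python
# (embryo av. %18:3, seed av. %18:3) from Table 11.
TABLE_11 = {
    "A2872": (12.1, 8.9),      # control
    "299/1/3": (5.6, 4.3),
    "299/15/1": (8.9, 2.5),
    "303/7/1": (7.3, 4.9),
    "306/3/1": (7.0, 2.4),
    "306/4/3": (8.5, 4.5),
    "306/4/5": (7.6, 4.6),
}

control_embryo, control_seed = TABLE_11["A2872"]


def percent_reduction(value, control):
    """Reduction of `value` relative to `control`, in percent."""
    return 100.0 * (control - value) / control


reductions = {
    line: (
        percent_reduction(embryo, control_embryo),
        percent_reduction(seed, control_seed),
    )
    for line, (embryo, seed) in TABLE_11.items()
    if line != "A2872"
}

# True when every line with reduced embryo 18:3 also shows reduced seed 18:3.
embryo_predicts_seed = all(
    seed_red > 0 for emb_red, seed_red in reductions.values() if emb_red > 0
)

for line, (emb_red, seed_red) in reductions.items():
    print(f"{line:9s} embryo -{emb_red:5.1f}%   seed -{seed_red:5.1f}%")
print("embryo reduction predicts seed reduction:", embryo_predicts_seed)
```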

Analysis Of Transgenic Glycine Max Embryos Containing
An Antisense Microsomal Delta-12 Desaturase Construct

The vectors pBS13, pBS14 and pST11 contain the
soybean microsomal delta-12 desaturase cDNA, in the
antisense orientation, under the control of the soybean
Kunitz Trypsin Inhibitor 3 (KTi3), Phaseolus phaseolin,
and soybean beta-conglycinin promoters as described
above. Liquid culture globular embryos transformed with
vectors pBS13, pBS14 and pST11 gave rise to mature
embryo lines as described above. Fatty acid analysis
was performed as described in WO 9311245 using single,
mature embryos as the tissue source. A number of
embryos from line A2872 (control tissue transformed with
pCST) and from line A2872 transformed with vectors
pBS13, pBS14 and pST11 were analyzed for fatty acid
content. About 30% of the transformed lines showed an
increased 18:1 content when compared with control lines
transformed with pCST described above, demonstrating
that the delta-12 desaturase had been inhibited in these
lines. The remaining transformed lines showed relative
fatty acid compositions similar to those of the control
line. The relative 18:1 content of the lines showing an
increased 18:1 content was as high as 50% compared with
a maximum of 12.5% in the control embryo lines. The
average 18:1 content of embryo lines which showed an
increased 18:1 content was about 35% (Table 12). In all
the lines showing an increased 18:1 content there was a
proportional decrease in the relative 18:2 content
(Table 13). The relative proportions of the other major
fatty acids (16:0, 18:0 and 18:3) were similar to those
of the control.

TABLE 12

Summary Of Experiment In Which Soybean Embryos Were
Transformed with Plasmids Containing A Soybean Antisense

              # of     # of lines       highest    av.
Vector        lines    with high 18:1   (%)        (%)

pCST          -        -                12.5       10.5
(control)
pBS13         11       4                53.5       35.9
pBS14         11       2                48.7       32.6
pST11         11       3                50.1       35.9


In Table 12 the average 18:1 of transgenics is the
average of all embryos transformed with a particular
vector whose relative 18:1 content is greater than two
standard deviations from the highest control value
(12.5). The control average is the average of ten A2872
embryos (standard deviation = 1.2). The data in
Table 12 are derived from Table 13 below.
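
The selection rule behind the Table 12 averages can be sketched
as follows. The code reads the rule as "18:1 content greater
than the highest control value plus two control standard
deviations", which is one interpretation of the sentence above,
and applies it to the nine 18:1 values legible for embryo line
G335/4/221 in Table 13. Table 12 pools all embryos for a vector,
so the single-line mean computed here is an illustration of the
criterion and is not expected to reproduce a Table 12 entry.

```python
CONTROL_MAX_18_1 = 12.5   # highest 18:1 value among control embryos
CONTROL_SD = 1.2          # standard deviation of the ten A2872 embryos

# Assumed reading of the rule: above the control maximum by more than 2 SD.
threshold = CONTROL_MAX_18_1 + 2 * CONTROL_SD

# 18:1 values for the nine embryos of line G335/4/221 (pBS13), from Table 13.
g335_4_221_18_1 = [30.4, 14.3, 15.2, 27.4, 25.1, 21.6, 11.3, 20.8, 25.3]

high_oleic = [value for value in g335_4_221_18_1 if value > threshold]
mean_high_oleic = sum(high_oleic) / len(high_oleic)

print(f"threshold: {threshold:.1f}% 18:1")
print(f"embryos above threshold: {len(high_oleic)} of {len(g335_4_221_18_1)}")
print(f"mean 18:1 of those embryos: {mean_high_oleic:.1f}%")
```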
TABLE 13

Relative Fatty Acid Contents Of Embryo Lines
Transformed With Plasmids Containing A
Soybean Antisense Delta-12 Desaturase cDNA

Embryo Line          Relative % Fatty-Acid Content

[?] marks a value or digit that is illegible in the source.

A2872 (control)
   #     16:0     18:0     18:1     18:2     18:3
   1     11.7      3.2     11.7     52.7     16.1
   2     16.4      4.0     10.8     47.1     19.3
   3     17.1      3.4      8.3     48.3     20.6
   4     15.[?]    2.7      [?]      [?]      [?]
   5     15.2      3.6     10.8     51.0     17.5
   6     18.6      3.9     10.9     45.8     18.1
   7     14.6      3.4     12.5     52.3     16.4
   8     14.2      3.5     11.2     53.9     16.7
   9     15.2      3.2      9.8     49.5     16.1
  10     19.0      3.8      9.6     47.4     19.0

G335/4/197 (pBS13)
   #     16:0     18:0     18:1     18:2     18:3
   1     12.2      [?]      [?]      [?]      [?]
   2     12.4      2.7     22.4     39.0     21.9
   3     12.0      3.2     42.0     23.2     18.4

G335/4/221 (pBS13)
   #     16:0     18:0     18:1     18:2     18:3
   1     12.2      2.7     30.4     36.0     17.9
   2     11.5      2.4     14.3     53.4     17.6
   3     13.0      2.6     15.2     47.4     19.9
   4     12.0      2.6     27.4     37.9     19.1
   5     11.7      2.7     25.1     42.3     15.6
   6     11.7      3.4     21.6     44.3     17.8
   7     12.0      2.5     11.3     53.6     20.0
   8     12.0      2.5     20.8     44.1     19.5
   9     11.7      2.6     25.3     39.6     18.3

G335/8/174 (pBS13)
   #     16:0     18:0     18:1     18:2     18:3
   1     11.7      [?]     30.1     32.4     23.3
   2     11.[?]    [?]     48.[?]   20.[?]   16.1
   3      [?]      [?]     46.6     17.1     19.5
   4     12.7      2.6     32.0     31.1     20.5
   5      [?]      [?]      [?]      [?]     18.[?]
   6     12.3      [?]      [?]      [?]     17.[?]
   7     11.3      2.4     53.[?]    [?]      [?]
   8     11.4      2.[?]    [?]      [?]      [?]
   9      [?]      [?]     43.[?]    [?]      [?]
  10     12.8      2.2     43.[?]    [?]      [?]

G335/6/42 (pBS14)
   #     16:0     18:0     18:1     18:2     18:3
   1     13.7      2.4      [?]      [?]      [?]
   2     12.6      2.3     37.[?]    [?]     17.[?]
   3     11.7      3.0     48.[?]    [?]      [?]

G335/6/104 (pBS14)
   #     16:0     18:0     18:1     18:2     18:3
   1     13.[?]    [?]      [?]      [?]     16.[?]
   2     12.3      2.3      [?]      [?]      [?]
   3     12.7      2.[?]    [?]      [?]      [?]
   4     12.6      2.2     32.1     34.[?]   17.4
   5     12.7      2.6     23.2     41.2     19.3
   6     12.6      2.2     11.7     52.5     20.1
   7     13.3      2.1     23.3     41.2     18.4

G335/1/25 (pST11)
   #     16:0     18:0     18:1     18:2     18:3
   1     13.7      2.8     50.7     17.5     12.1
   2     14.5      3.0     41.8     23.5     15.0
   3      [?]      [?]      [?]      [?]      [?]
  [?]    13.6      [?]     47.5     19.3     14.8

G335/2/7/1 (pST11)
   #     16:0     18:0     18:1     18:2     18:3
   1     15.5      [?]      [?]      [?]      [?]
   2     17.8      4.1     22.[?]   39.5     14.0
   3     15.2      3.0     20.5     42.2     16.5

G335/2/118 (pST11)
   #     16:0     18:0     18:1     18:2     18:3
   1     14.1      2.7     44.7     22.6     14.0
   2     15.8      2.8     37.7     26.9     14.8
   3     17.3      3.4     23.3     37.9     16.0

N.B. All other transformed embryos (24 lines) had fatty
acid profiles similar to those of the control.

One of these embryo lines, G335/1/25, had an
average 18:2 content of less than 20% and an average
18:1 content greater than 45% (and as high as 53.5%).
The Applicants expect, based on the data in Table ?,
that seeds derived from plants regenerated from such
lines will have an equivalent or greater increase in
18:1 content and an equivalent or greater
decrease in 18:2 content.

EXAMPLE

EXPRESSION OF MICROSOMAL DELTA-12 DESATURASE IN CANOLA

Construction Of Vectors For Transformation Of
Brassica Napus For Reduced Expression Of
Microsomal Delta-12 Desaturases
In Developing Canola Seeds

An extended poly A tail was removed from the canola
delta-12 desaturase sequence contained in plasmid
pCF2-165D and additional restriction sites for cloning
were introduced as follows. A PCR primer was
synthesized corresponding to bases 354 through 371 of
SEQ ID NO: 3. The second PCR primer was synthesized as
the complement to bases 1253 through 1231 with 15
additional bases (GCAGATATCGCGGCC) added to the 5' end.
The additional bases encode both an EcoRV site and a NotI
site. pCF2-165D was used as the template for PCR
amplification using these primers. The 914 base pair
product of PCR amplification was digested with EcoRV and
PflMI to give an 812 base pair product corresponding to
bases 450 through 1253 of pCF2-165D with the added NotI
site.
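
The 15-base extension on the second primer can be checked
directly for the two sites it is said to introduce. In the
Python sketch below, the recognition sequences GATATC (EcoRV)
and GCGGCCGC (NotI) are the standard ones and are not quoted in
the text. The extension carries a complete EcoRV site but only
the first six bases of a NotI site; the sketch assumes,
without confirmation from the text, that the remaining two
bases come from the template-complementary part of the primer.
The expected amplicon length is computed from the stated
coordinates for comparison with the reported 914 bp.

```python
TAIL = "GCAGATATCGCGGCC"   # 15 bases added to the 5' end of the second primer
ECORV = "GATATC"
NOTI = "GCGGCCGC"

ecorv_index = TAIL.find(ECORV)

# Longest prefix of the NotI site that the tail ends with.
noti_overlap = max(
    (k for k in range(1, len(NOTI) + 1) if TAIL.endswith(NOTI[:k])),
    default=0,
)
noti_bases_needed_from_template = NOTI[noti_overlap:]

# Amplicon length: bases 354..1253 inclusive, plus the 15-base extension.
FORWARD_START, REVERSE_END = 354, 1253
expected_amplicon_bp = (REVERSE_END - FORWARD_START + 1) + len(TAIL)

print("EcoRV site starts at tail index:", ecorv_index)
print("NotI bases present in tail:", noti_overlap, "of", len(NOTI))
print("bases still needed after the tail:", noti_bases_needed_from_template)
print("expected amplicon length:", expected_amplicon_bp, "bp")
```

The computed length is within one base of the 914 base pair
product reported above.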

pCF2-165D was digested with PstI, the PstI overhang
was blunted with Klenow fragment and then digested with
PflMI. The 3.5 kb fragment corresponding to pBluescript
along with the 5' 450 bases of the canola Fad2 cDNA was
gel purified and ligated to the above described 812 base
pair fragment. The ligation product was amplified by
transformation of E. coli and plasmid DNA isolation.
The EcoRI site remaining at the cloning junction between
pBluescript and the canola Fad2 cDNA was destroyed by
digestion, blunting and religation. The recovered
plasmid was called pM2CFd2.

pM2CFd2 was digested with EcoRV and SmaI to remove
the Fad2 insert as a blunt ended fragment. The fragment
was gel purified and cloned into the SmaI site of pBC
(Stratagene, La Jolla, CA). A plasmid with the NotI
site introduced by PCR oriented away from the existing
NotI site in pBC was identified by NotI digestion and
gel fractionation of the digests. The resulting
construct then had NotI sites at both ends of the canola
Fad2 cDNA fragment and was called pM3CFd2.

Vectors for transformation of the antisense
cytoplasmic delta-12 desaturase constructions under
control of the β-conglycinin, Kunitz trypsin inhibitor
III, napin and phaseolin promoters into plants using
Agrobacterium tumefaciens were produced by constructing
a binary Ti plasmid vector system (Bevan, (1984) Nucl.
Acids Res. 12:8711-8720). One starting vector for the
system (pZS199) is based on a vector which contains:
(1) the chimeric gene nopaline synthase/neomycin
phosphotransferase as a selectable marker for
transformed plant cells (Bevan et al. (1984) Nature
304:184-186), (2) the left and right borders of the
T-DNA of the Ti plasmid (Bevan et al. (1984) Nucl.
Acids Res. 12:8711-8720), (3) the E. coli lacZ
α-complementing segment (Vieira and Messing (1982) Gene
19:259-267) with unique restriction endonuclease sites
for Eco RI, Kpn I, Bam HI, and Sal I, (4) the bacterial
replication origin from the Pseudomonas plasmid pVS1
(Itoh et al. (1984) Plasmid 11:206-220), and (5) the
bacterial neomycin phosphotransferase gene from Tn5
(Berg et al. (1975) Proc. Natl. Acad. Sci. U.S.A.
72:3628-3632) as a selectable marker for transformed
A. tumefaciens. The nopaline synthase promoter in the
plant selectable marker was replaced by the 35S promoter
(Odell et al. (1985) Nature, 313:810-813) by a standard
restriction endonuclease digestion and ligation
strategy. The 35S promoter is required for efficient
Brassica napus transformation as described below. A
second vector (pZS212) was constructed by reversing the
order of restriction sites in the unique site cloning
region of pZS199.

Canola napin promoter expression cassettes were
constructed as follows: Ten oligonucleotide primers
were synthesized based upon the nucleotide sequence of
napin lambda clone CGN1-2 published in European Patent
Application EP 255378. The oligonucleotide sequences
were:

• BR42 and BR43 corresponding to bases 1132 to 1156
  (BR42) and the complement of bases 2248 to 2271 (BR43)
  of the sequence listed in Figure 2 of EP 255378.

• BR45 and BR46 corresponding to bases 1150 to 1170
  (BR46) and the complement of bases 2120 to 2155 (BR45)
  of the sequence listed in Figure 2 of EP 255378. In
  addition BR46 had bases corresponding to a Sal I site
  (5'-GTCGAC-3') and a few additional bases
  (5'-TCAGGCCT-3') at its 5' end and BR45 had bases
  corresponding to a Bgl II site (5'-AGATCT-3') and two
  (5'-CT-3') additional bases at the 5' end of the
  primer,

• BR47 and BR48 corresponding to bases 2705 to 2723
  (BR47) and bases 2643 to 2666 (BR48) of the sequence
  listed in Figure 2 of EP 255378. In addition BR47 had
  two (5'-CT-3') additional bases at the 5' end of the
  primer followed by bases corresponding to a Bgl II
  site (5'-AGATCT-3') followed by a few additional bases
  (5'-TCAGGCCT-3'),

• BR49 and BR50 corresponding to the complement of bases
  3877 to 3897 (BR49) and the complement of bases 3985
  to 3919 (BR50) of the sequence listed in Figure 2 of
  EP 255378. In addition BR49 had bases corresponding
  to a Sal I site (5'-GTCGAC-3') and a few additional
  bases (5'-TCAGGCCT-3') at its 5' end,

• BR57 and BR58 corresponding to the complement of bases
  3875 to 3888 (BR57) and bases 2700 to 2714 (BR58) of
  the sequence listed in Figure 2 of EP 255378. In
  addition the 5' end of BR57 had some extra bases
  (5'-CCATGG-3') followed by bases corresponding to a
  Sac I site (5'-GAGCTC-3') followed by more additional
  bases (5'-GTCGACGAGG-3'). The 5' end of BR58 had
  additional bases (5'-GAGCTC-3') followed by bases
  corresponding to a Nco I site (5'-CCATGG-3') followed
  by additional bases (5'-AGATCTGGTACC-3').

• BR61 and BR62 corresponding to bases 1846 to 1865
  (BR61) and bases 2094 to 2114 (BR62) of the sequence
  listed in Figure 2 of EP 255378. In addition the 5'
  end of BR62 had additional bases (5'-GACA-3')
  followed by bases corresponding to a Bgl II site
  (5'-AGATCT-3') followed by a few additional bases
  (5'-GCGGCCGC-3').
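
The 5' extensions listed for the primers carry more restriction
sites than the ones named explicitly, which matters for the
later cloning steps (for example the NotI site used with
pIMC401). The Python sketch below joins the extension segments
of BR57, BR58 and BR62 in the order given above and scans them
for a small set of standard recognition sequences. The enzyme
table is supplied here for illustration and the segment order
is taken at face value from the description; the gene-specific
parts of the primers are not included because their sequences
are not given.

```python
# Standard recognition sequences (not quoted in full in the text).
SITES = {
    "Sal I": "GTCGAC",
    "Bgl II": "AGATCT",
    "Not I": "GCGGCCGC",
    "Nco I": "CCATGG",
    "Sac I": "GAGCTC",
    "Kpn I": "GGTACC",
}

# 5' extension segments, in the order described for each primer.
TAIL_PARTS = {
    "BR57": ["CCATGG", "GAGCTC", "GTCGACGAGG"],
    "BR58": ["GAGCTC", "CCATGG", "AGATCTGGTACC"],
    "BR62": ["GACA", "AGATCT", "GCGGCCGC"],
}


def sites_in(sequence):
    """Map each enzyme whose site occurs in `sequence` to its first index."""
    return {
        name: sequence.find(site)
        for name, site in SITES.items()
        if site in sequence
    }


tails = {primer: "".join(parts) for primer, parts in TAIL_PARTS.items()}
found = {primer: sites_in(tail) for primer, tail in tails.items()}

for primer, tail in tails.items():
    print(primer, tail, found[primer])
```

On this reading, the "additional bases" of BR57 and BR58 are
themselves restriction sites (Nco I, Sal I, Bgl II and Kpn I),
and the final eight bases of the BR62 extension form a NotI
site.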
Genomic DNA from the canola variety 'Hyola401'
(Zeneca Seeds) was used as a template for PCR
amplification of the napin promoter and napin terminator
regions. The promoter was first amplified using primers
BR42 and BR43, and reamplified using primers BR45 and
BR46. Plasmid pIMC01 was derived by digestion of the
1.0 kb promoter PCR product with SalI/BglII and ligation
into SalI/BamHI digested pBluescript SK+ (Stratagene).
The napin terminator region was amplified using primers
BR48 and BR50, and reamplified using primers BR47 and
BR49. Plasmid pIMC06 was derived by digestion of the
1.2 kb terminator PCR product with SalI/BglII and
ligation into SalI/BglII digested pSP72 (Promega).
Using pIMC06 as a template, the terminator region was
reamplified by PCR using primer BR57 and primer BR58.
Plasmid pIMC101 containing both the napin promoter and
terminator was generated by digestion of the PCR product
with SacI/NcoI and ligation into SacI/NcoI digested
pIMC01. Plasmid pIMC101 contains a 2.2 kb napin
expression cassette including complete napin 5' and 3'
non-translated sequences and an introduced NcoI site at
the translation start ATG. Primer BR61 and primer BR62
were used to PCR amplify an ~270 bp fragment from the 3'
end of the napin promoter. Plasmid pIMC401 was obtained
by digestion of the resultant PCR product with
EcoRI/BglII and ligation into EcoRI/BglII digested
pIMC101. Plasmid pIMC401 contains a 2.2 kb napin
expression cassette lacking the napin 5' non-translated
sequence and includes a NotI site at the transcription
start.
To construct the antisense expression vector,
pM3CFd2 was digested with NotI as was pIMC401. The
delta-12 desaturase containing insert from the digest of
pM3CFd2 was gel isolated and ligated into the NotI
digested and phosphatase treated pIMC401. An isolate in
which the delta-12 desaturase was oriented antisense to
the napin promoter was selected by digestion with XhoI
and PflMI to give plasmid pNCFd2R. pNCFd2R was digested
with SalI, phosphatase treated and ligated into pZS212
which had been opened by the same treatment. A plasmid
with desired orientation of the introduced
napin:delta-12 desaturase antisense transcription unit
relative to the selectable marker was chosen by
digestion with PvuI and the resulting binary vector was
given the name pZNCFd2R.

Plasmid pML70 (described in Example 6 above) was
digested with NcoI, blunted then digested with KpnI.
Plasmid pM2CFd2 was digested with KpnI and SmaI and the
isolated fragment ligated into the opened pML70 to give
the antisense expression cassette pMKCFd2R. The
promoter:delta-12 desaturase:terminator sequence was
removed from pMKCFd2R by BamHI digestion and ligated
into pZS199 which had been BamHI digested and
phosphatase treated. The desired orientation relative
to the selectable marker was determined by digestion
with XhoI and PflMI to give the expression vector
pZKCFd2R.

The expression vector containing the β-conglycinin
promoter was constructed by SmaI and EcoRV digestion of
pM2CFd2 and ligation into SmaI cut pML109A. An isolate
with the antisense orientation was identified by
digestion with XhoI and PflMI, and the transcription
unit was isolated by SalI and EcoRI digestion. The
isolated SalI-EcoRI fragment was ligated into EcoRI-SalI
digested pZS199 to give pCCFd2R.
The expression vector containing the phaseolin
promoter was obtained using the same procedure with
pCW108 as the starting, promoter-containing vector and
pZS212 as the binary portion of the vector to give
pZPhCFd2R.

Agrobacterium-Mediated
Transformation Of Brassica Napus

The binary vectors pZNCFd2R, pZCCFd2R, pZPhCFd2R,
and pZNCFd2R were transferred by a freeze/thaw method
(Holsters et al. (1978) Mol Gen Genet 163:181-187) to
the Agrobacterium strain LBA4404/pAL4404 (Hoekema et al.
(1983), Nature 303:179-180).

Brassica napus cultivar "Westar" was transformed by
co-cultivation of seedling pieces with disarmed
Agrobacterium tumefaciens strain LBA4404 carrying the
appropriate binary vector.

B. napus seeds were sterilized by stirring in 10%
Chlorox, 0.1% SDS for thirty min, and then rinsed
thoroughly with sterile distilled water. The seeds were
germinated on sterile medium containing 30 mM CaCl2 and
1.5% agar, and grown for six days in the dark at 24°C.

Liquid cultures of Agrobacterium for plant
transformation were grown overnight at 28°C in Minimal A
medium containing 100 mg/L kanamycin. The bacterial
cells were pelleted by centrifugation and resuspended at
a concentration of 10^8 cells/mL in liquid Murashige and
Skoog Minimal Organic medium containing 100 µM
acetosyringone.

B. napus seedling hypocotyls were cut into 5 mm
segments which were immediately placed into the
bacterial suspension. After 30 min, the hypocotyl
pieces were removed from the bacterial suspension and
placed onto BC-35 callus medium containing 100 µM
acetosyringone. The plant tissue and Agrobacteria were
co-cultivated for three days at 24°C in dim light.

The co-cultivation was terminated by transferring
the hypocotyl pieces to BC-35 callus medium containing
200 mg/L carbenicillin to kill the Agrobacteria, and
25 mg/L kanamycin to select for transformed plant cell
growth. The seedling pieces were incubated on this
medium for three weeks at 28°C under continuous light.

After four weeks, the segments were transferred to
BS-48 regeneration medium containing 200 mg/L
carbenicillin and 25 mg/L kanamycin. Plant tissue was
subcultured every two weeks onto fresh selective
regeneration medium, under the same culture conditions
described for the callus medium. Putatively transformed
calli grew rapidly on regeneration medium; as calli
reached a diameter of about 2 mm, they were removed from
the hypocotyl pieces and placed on the same medium
lacking kanamycin.

Shoots began to appear within several weeks after
transfer to BS-48 regeneration medium. As soon as the
shoots formed discernible stems, they were excised from
the calli, transferred to MSV-IA elongation medium, and
moved to a 16:8 h photoperiod at 24°C.

Once shoots had elongated several internodes, they
were cut above the agar surface and the cut ends were
dipped in Rootone. Treated shoots were planted directly
into wet Metro-Mix 350 soilless potting medium. The pots
were covered with plastic bags which were removed when
the plants were clearly growing, after about ten days.

Plants were grown under a 16:8 h photoperiod, with
a daytime temperature of 23°C and a nighttime temperature
of 17°C. When the primary flowering stem began to
elongate, it was covered with a mesh pollen-containment
bag to prevent outcrossing. Self-pollination was
facilitated by shaking the plants several times each
day. Fifty-one plants have thus far been obtained from
transformations using both pZCCFd2R and pZPhCFd2R, 40
plants have been obtained from pZKCFd2R and 26 from
pZNCFd2R.

Minimal A Bacterial Growth Medium

Dissolve in distilled water:
10.5 grams potassium phosphate, dibasic
4.5 grams potassium phosphate, monobasic
1.0 gram ammonium sulfate
0.5 gram sodium citrate, dihydrate
Make up to 979 mL with distilled water
Autoclave
Add 20 mL filter-sterilized 10% sucrose
Add 1 mL filter-sterilized 1 M MgSO4

Brassica Callus Medium BC-35

Per liter:
Murashige and Skoog Minimal Organic Medium (MS
salts, 100 mg/L i-inositol, 0.4 mg/L thiamine;
GIBCO #510-3118)
30 grams sucrose
18 grams mannitol
0.5 mg/L 2,4-D
0.3 mg/L kinetin
0.6% agarose
pH 5.8

Brassica Regeneration Medium BS-48

Murashige and Skoog Minimal Organic Medium
Gamborg B5 Vitamins (SIGMA #1019)
10 grams glucose
250 mg xylose
600 mg MES
0.4% agarose
pH 5.7
Filter-sterilize and add after autoclaving:
2.0 mg/L zeatin
0.1 mg/L IAA

Brassica Shoot Elongation Medium MSV-IA

Murashige and Skoog Minimal Organic Medium
Gamborg B5 Vitamins
10 grams sucrose
0.6% agarose
pH 5.8
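
The volumes in the Minimal A recipe add up to one liter, and the final concentrations of the two filter-sterilized additions follow from simple dilution arithmetic. The sketch below works this through and scales the salts to a smaller batch. It assumes the 10% sucrose stock is weight/volume, which the recipe does not state; the function and variable names are illustrative only.

```python
# Salt amounts per batch, in grams, as listed in the Minimal A recipe.
MINIMAL_A_SALTS_G = {
    "potassium phosphate, dibasic": 10.5,
    "potassium phosphate, monobasic": 4.5,
    "ammonium sulfate": 1.0,
    "sodium citrate, dihydrate": 0.5,
}
BASE_VOLUME_ML = 979.0    # autoclaved salt solution
SUCROSE_STOCK_ML = 20.0   # 10% stock (assumed w/v)
MGSO4_STOCK_ML = 1.0      # 1 M stock


def final_concentration(stock_conc, stock_ml, total_ml):
    """Concentration after dilution, in the same unit as stock_conc."""
    return stock_conc * stock_ml / total_ml


def scale_salts(target_total_ml, batch_total_ml):
    """Scale every salt linearly to a different final volume."""
    factor = target_total_ml / batch_total_ml
    return {name: grams * factor for name, grams in MINIMAL_A_SALTS_G.items()}


total_ml = BASE_VOLUME_ML + SUCROSE_STOCK_ML + MGSO4_STOCK_ML
sucrose_percent = final_concentration(10.0, SUCROSE_STOCK_ML, total_ml)
mgso4_millimolar = final_concentration(1000.0, MGSO4_STOCK_ML, total_ml)
quarter_batch = scale_salts(250.0, total_ml)

print(f"total volume: {total_ml:.0f} mL")
print(f"sucrose: {sucrose_percent:.2f} %")
print(f"MgSO4: {mgso4_millimolar:.2f} mM")
print(quarter_batch)
```

Under that assumption the finished medium contains 0.2% sucrose and 1 mM MgSO4 in 1000 mL.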

Analysis Of Transgenic Brassica Napus Seeds Containing
An Antisense Microsomal Delta-12 Desaturase Construct

Fifty-one plants were obtained from transformation
with both pZPhCFd2R and pZCCFd2R, 40 were obtained from
pZKCFd2R, and 26 from pZNCFd2R. The relative levels of
oleate (18:1), linoleate (18:2) and linolenate (18:3)
change during development so that reliable determination
of seed fatty acid phenotype is best obtained from seed
which has undergone normal maturation and drydown.
Relatively few transformed plants have gone through to
maturity; however, seeds were sampled from plants which
had been transferred to pots for at least 80 days and
which had pods that had yellowed and contained seeds
with seed coats which had black pigmentation. Plants
were chosen for early analysis based on promoter type,
presence and copy number of the inserted delta-12
desaturase antisense gene and fertility of the plant.

Fatty acid analysis was done on either individual
seeds from transformed and control plants, or on 40 mg
of bulk seed from individual plants as described in
Example 6. Southern analysis for detection of the
presence of canola delta-12 desaturase antisense genes
was done on DNA obtained from leaves of transformed
plants. DNA was digested either to release the
promoter:delta-12 desaturase fragment from the
transformation vector or to cut outside the coding
region of the delta-12 desaturase antisense gene, but
within the left and right T-DNA borders of the vector.



Relative Fatty Acid Profiles of Microsomal Delta-12 Desaturase
Antisense Transformed and Control Brassica Napus Seeds

                                          % of TOTAL FATTY ACIDS
PLANT #    PROMOTER       COPY #   AGE*   16:0   18:0   18:1   18:2   18:3
Westar     control        none     82     4.6    1.2    64.6   20.9   6.6
151-22     phaseolin      >8       82     4.4    1.0    76.6   10.0   6.2
158-8      napin          1        83     3.5    1.5    81.3   6.3    4.6
Westar     control        none     106    4.1    1.7    64.4   19.9   7.1
151-22     phaseolin      >8       106    4.2    1.9    74.4   9.9    6.3
151-127    phaseolin      0        106    4.1    2.3    68.4   16.9   5.2
151-268    phaseolin      1        106    4.2    2.7    73.3   12.0   4.2
153-83     conglycinin    2        106    4.1    1.6    68.5   16.7   6.3

*Seed sampling date in days after the plant was transferred to
soil

The expected fatty acid phenotype for antisense
suppression of the delta-12 desaturase is decreased
relative content of 18:2 with a corresponding increase
in 18:1. Plant numbers 151-22 and 158-8 both show a
substantial decrease in 18:2 content of bulk seed when
compared to the Westar control at 83 days after
planting. Plant 151-22 also shows this difference at
maturity in comparison to either the Westar control or
plant 151-127, which was transformed with the selectable
marker gene but not the delta-12 desaturase antisense
gene.
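
The size of the shift described above can be read off directly from the bulk-seed values. The short sketch below subtracts the Westar control of the nearest sampling date from each transformed plant, using the 18:1 and 18:2 percentages from the bulk-seed table; the dictionary layout and function name are illustrative, not part of the original analysis.

```python
# Percent of total fatty acids in bulk seed, keyed by (plant, days in soil).
BULK_SEED = {
    ("Westar", 82): {"18:1": 64.6, "18:2": 20.9},
    ("151-22", 82): {"18:1": 76.6, "18:2": 10.0},
    ("158-8", 83): {"18:1": 81.3, "18:2": 6.3},
    ("Westar", 106): {"18:1": 64.4, "18:2": 19.9},
    ("151-22", 106): {"18:1": 74.4, "18:2": 9.9},
    ("151-127", 106): {"18:1": 68.4, "18:2": 16.9},
}


def shift_from_control(sample, control):
    """Difference in percentage points (sample minus control) per fatty acid."""
    return {fa: round(sample[fa] - control[fa], 1) for fa in sample}


early_control = BULK_SEED[("Westar", 82)]
late_control = BULK_SEED[("Westar", 106)]

shifts = {
    "151-22 (82 d)": shift_from_control(BULK_SEED[("151-22", 82)], early_control),
    "158-8 (83 d)": shift_from_control(BULK_SEED[("158-8", 83)], early_control),
    "151-22 (106 d)": shift_from_control(BULK_SEED[("151-22", 106)], late_control),
    "151-127 (106 d)": shift_from_control(BULK_SEED[("151-127", 106)], late_control),
}

for plant, delta in shifts.items():
    print(f"{plant}: 18:1 {delta['18:1']:+.1f}, 18:2 {delta['18:2']:+.1f}")
```

The antisense plants show a drop in 18:2 of roughly 10 to 15 percentage points with a matching rise in 18:1, while 151-127, which lacks the antisense gene, moves by only a few points.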

Since the fatty acid analysis was done on seeds
from the primary transformant, individual seeds should be
segregating for the presence of the transgene copy or
copies. The segregating phenotypes serve as an internal
control for the effect of the delta-12 desaturase
antisense gene. The relative fatty acid phenotypes for
10 individual Westar seeds, 10 individual 151-22 seeds
and 12 individual 158-8 seeds are given in Table 15
below.

TABLE 15

Relative Fatty Acid Profiles for Individual Seeds
of Control and Genetically Segregating Delta-12
Desaturase Transformed Brassica Napus Seeds

Westar
16:0    18:0    18:1    18:2    18:3
4.65    1.05    63.45   21.31   7.29
4.65    1.37    65.41   20.72   6.18
3.86    1.31    62.19   22.50   8.18
4.46    1.41    66.81   19.40   5.63
4.76    1.30    61.90   22.39   7.65
4.59    1.10    64.77   20.62   6.56
4.61    1.16    68.66   18.20   5.07
4.71    1.26    67.28   19.32   5.18
4.67    0.98    61.96   22.93   7.61
4.73    1.33    63.85   21.65   6.23

151-22
16:0    18:0    18:1    18:2    18:3
4.56    1.08    73.40   12.40   7.60
4.25    1.20    77.90   10.00   5.40
4.40    1.00    76.90   10.10   5.90
4.40    0.94    77.40   9.40    6.10
4.50    1.00    73.60   11.30   7.90
4.60    0.98    75.40   10.50   6.50
4.49    0.96    76.70   9.90    6.00
4.20    1.10    77.20   9.70    5.50
4.20    1.00    80.00   7.90    4.90
4.50    1.00    78.00   8.80    5.80

158-8
16:0    18:0    18:1    18:2    18:3
3.62    1.67    84.45   3.60    3.73
3.46    1.64    85.56   3.02    3.36
3.48    1.61    83.64   4.43    4.21
3.53    1.40    83.80   4.41    4.36
3.48    1.39    83.66   4.35    4.44
3.80    1.50    68.17   16.57   7.56
3.41    1.40    83.76   4.38    4.40
3.49    1.29    82.77   5.16    4.60
3.77    1.39    69.47   16.40   6.54
3.44    1.36    83.86   4.49    4.27
3.48    1.38    83.15   4.91    4.53
3.55    1.92    83.69   4.20    3.70



The Westar control shows comparatively little seed
to seed variation in content of 18:1 or 18:2. Further,
the ratio of 18:3/18:2 remains very constant between
seeds at about 0.35. Plant #158-8 should show a
segregation ratio of either 1:2:1 or 1:3 since by
Southern analysis it contains a single transgene. The
1:2:1 ratio would indicate a semi-dominant, copy number
effect while the 1:3 ratio would indicate complete
dominance. Two wild type 158-8 segregants are clear in
Table 15, while the remaining seeds may either be the
same, or the two seeds at greater than 84% 18:1 may
represent the homozygous transgenic. In either case
the fatty acid phenotypes of the seeds are as expected
for effective delta-12 desaturase suppression in this
generation. The fatty acid phenotypes of the seeds of
plant 151-22 show variation in their 18:1 and 18:2
content, with 18:1 higher than the control average and
18:2 lower. The segregation is apparently quite
complex, as would be expected of a multi-copy transgenic
plant.
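
The two segregation hypotheses for plant 158-8 can be checked against the twelve single-seed 18:1 values with a chi-square goodness-of-fit calculation. The sketch below is illustrative only: the 75% cutoff for calling a seed wild type is an assumption chosen to separate the two low-oleate seeds from the rest, the 84% cutoff comes from the text, and the critical values are the standard 5% points for 1 and 2 degrees of freedom. With only twelve seeds the test has little power, so it can show consistency with a ratio but cannot choose between them.

```python
# 18:1 content (% of total fatty acids) of the 12 individual 158-8 seeds.
OLEATE_158_8 = [84.45, 85.56, 83.64, 83.80, 83.66, 68.17,
                83.76, 82.77, 69.47, 83.86, 83.15, 83.69]

WILD_TYPE_MAX = 75.0    # assumed cutoff: below this a seed is scored wild type
HOMOZYGOUS_MIN = 84.0   # from the text: seeds above 84% 18:1

CRITICAL_5PCT = {1: 3.841, 2: 5.991}  # chi-square critical values, alpha = 0.05


def chi_square(observed, expected_ratio):
    """Pearson chi-square statistic for observed counts against a ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


wild = sum(v < WILD_TYPE_MAX for v in OLEATE_158_8)
high = sum(v > HOMOZYGOUS_MIN for v in OLEATE_158_8)
middle = len(OLEATE_158_8) - wild - high

# Complete dominance: 1 wild type to 3 suppressed.
chi_1_3 = chi_square([wild, middle + high], [1, 3])
# Semi-dominance: 1 wild type : 2 hemizygous : 1 homozygous.
chi_1_2_1 = chi_square([wild, middle, high], [1, 2, 1])

print(f"counts (wild, middle, high): {wild}, {middle}, {high}")
print(f"1:3   chi-square = {chi_1_3:.3f}, reject at 5%: {chi_1_3 > CRITICAL_5PCT[1]}")
print(f"1:2:1 chi-square = {chi_1_2_1:.3f}, reject at 5%: {chi_1_2_1 > CRITICAL_5PCT[2]}")
```

Neither ratio is rejected under these cutoffs, which agrees with the statement that the data fit either interpretation.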
SEQUENCE LISTING
GENERAL INFORMATION:

(i) APPLICANT: E. I. DU PONT DE NEMOURS AND COMPANY

(ii) TITLE OF INVENTION: GENES FOR MICROSOMAL FATTY ACID
DELTA-12 DESATURASES AND RELATED ENZYMES FROM PLANTS

(iii) NUMBER OF SEQUENCES: 17

(iv) CORRESPONDENCE ADDRESS:
    (A) ADDRESSEE: E. I. DU PONT DE NEMOURS AND COMPANY
    (B) STREET: 1007 MARKET STREET
    (C) CITY: WILMINGTON
    (D) STATE: DELAWARE
    (E) COUNTRY: U.S.A.
    (F) ZIP: 19898

(v) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: Floppy disk
    (B) COMPUTER: Macintosh
    (C) OPERATING SYSTEM: Macintosh System, 6.0
    (D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:
    (A) APPLICATION NUMBER:
    (B) FILING DATE:
    (C) CLASSIFICATION:

BB-1043-A

(vii) PRIOR APPLICATION DATA:
    (A) APPLICATION NUMBER: U.S. 07/977,339
    (B) FILING DATE: 17-NOV-1992

(viii) ATTORNEY/AGENT INFORMATION:
    (A) NAME: Morrissey, Bruce W
    (B) REGISTRATION NUMBER: 330,663
    (C) REFERENCE/DOCKET NUMBER: BB-1043-A

(ix) TELECOMMUNICATION INFORMATION:
    (A) TELEPHONE: (302) 992-4927
    (B) TELEFAX: (302) 892-7949
    (C) TELEX: 835420


(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 1372 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
    (A) ORGANISM: Arabidopsis thaliana

(vii) IMMEDIATE SOURCE:
    (B) CLONE: p92103

(ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 93..1244

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

AGAGAGAGAG ATTCTGCGGA GGAGCTTCTT CTTCGTAGGG TGTTCATCGT TATTAACGTT  60

ATCGCCCCTA CGTCAGCTCC ATCTCCAGAA AC ATG GGT GCA GGT GGA AGA ATG  113
                                    Met Gly Ala Gly Gly Arg Met
                                    1               5

CCG GTT CCT ACT TCT TCC AAG AAA TCG GAA ACC GAC ACC ACA AAG CGT  161
Pro Val Pro Thr Ser Ser Lys Lys Ser Glu Thr Asp Thr Thr Lys Arg
        10                  15                  20

GTG CCG TGC GAG AAA CCG CCT TTC TCG GTG GGA GAT CTG AAG AAA GCA  209
Val Pro Cys Glu Lys Pro Pro Phe Ser Val Gly Asp Leu Lys Lys Ala
    25                  30                  35

ATC CCG CCG CAT TGT TTC AAA CGC TCA ATC CCT CGC TCT TTC TCC TAC  257
Ile Pro Pro His Cys Phe Lys Arg Ser Ile Pro Arg Ser Phe Ser Tyr
40                  45                  50                  55

CTT ATC AGT GAC ATC ATT ATA GCC TCA TGC TTC TAC TAC GTC GCC ACC  305
Leu Ile Ser Asp Ile Ile Ile Ala Ser Cys Phe Tyr Tyr Val Ala Thr
                60                  65                  70

AAT TAC TTC TCT CTC CTC CCT CAG CCT CTC TCT TAC TTG GCT TGG CCA  353
Asn Tyr Phe Ser Leu Leu Pro Gln Pro Leu Ser Tyr Leu Ala Trp Pro
            75                  80                  85

CTC TAT TGG GCC TGT CAA GGC TGT GTC CTA ACT GGT ATC TGG GTC ATA  401
Leu Tyr Trp Ala Cys Gln Gly Cys Val Leu Thr Gly Ile Trp Val Ile
        90                  95                  100

GCC CAC GAA TGC GGT CAC CAC GCA TTC AGC GAC TAC CAA TGG CTG GAT  449
Ala His Glu Cys Gly His His Ala Phe Ser Asp Tyr Gln Trp Leu Asp
    105                 110                 115

GAC ACA GTT GGT CTT ATC TTC CAT TCC TTC CTC CTC GTC CCT TAC TTC  497
Asp Thr Val Gly Leu Ile Phe His Ser Phe Leu Leu Val Pro Tyr Phe
120                 125                 130                 135

TCC TGG AAG TAT AGT CAT CGC CGT CAC CAT TCC AAC ACT GGA TCC CTC  545
Ser Trp Lys Tyr Ser His Arg Arg His His Ser Asn Thr Gly Ser Leu
                140                 145                 150

GAA AGA GAT GAA GTA TTT GTC CCA AAG CAG AAA TCA GCA ATC AAG TGG  593
Glu Arg Asp Glu Val Phe Val Pro Lys Gln Lys Ser Ala Ile Lys Trp
            155                 160                 165

TAC GGG AAA TAC CTC AAC AAC CCT CTT GGA CGC ATC ATG ATG TTA ACC  641
Tyr Gly Lys Tyr Leu Asn Asn Pro Leu Gly Arg Ile Met Met Leu Thr
        170                 175                 180

GTC CAG TTT GTC CTC GGG TGG CCC TTG TAC TTA GCC TTT AAC GTC TCT  689
Val Gln Phe Val Leu Gly Trp Pro Leu Tyr Leu Ala Phe Asn Val Ser
    185                 190                 195

GGC AGA CCG TAT GAC GGG TTC GCT TGC CAT TTC TTC CCC AAC GCT CCC  737
Gly Arg Pro Tyr Asp Gly Phe Ala Cys His Phe Phe Pro Asn Ala Pro
200                 205                 210                 215

ATC TAC AAT GAC CGA GAA CGC CTC CAG ATA TAC CTC TCT GAT GCG GGT  785
Ile Tyr Asn Asp Arg Glu Arg Leu Gln Ile Tyr Leu Ser Asp Ala Gly
                220                 225                 230

ATT CTA GCC GTC TGT TTT GGT CTT TAC CGT TAC GCT GCT GCA CAA GGG  833
Ile Leu Ala Val Cys Phe Gly Leu Tyr Arg Tyr Ala Ala Ala Gln Gly
            235                 240                 245

ATG GCC TCG ATG ATC TGC CTC TAC GGA GTA CCG CTT CTG ATA GTG AAT  881
Met Ala Ser Met Ile Cys Leu Tyr Gly Val Pro Leu Leu Ile Val Asn
        250                 255                 260

GCG TTC CTC GTC TTG ATC ACT TAC TTG CAG CAC ACT CAT CCC TCG TTG  929
Ala Phe Leu Val Leu Ile Thr Tyr Leu Gln His Thr His Pro Ser Leu
    265                 270                 275

CCT CAC TAC GAT TCA TCA GAG TGG GAC TGG CTC AGG GGA GCT TTG GCT  977
Pro His Tyr Asp Ser Ser Glu Trp Asp Trp Leu Arg Gly Ala Leu Ala
280                 285                 290                 295

ACC GTA GAC AGA GAC TAC GGA ATC TTG AAC AAG GTG TTC CAC AAC ATT  1025
Thr Val Asp Arg Asp Tyr Gly Ile Leu Asn Lys Val Phe His Asn Ile
                300                 305                 310

ACA GAC ACA CAC GTG GCT CAT CAC CTG TTC TCG ACA ATG CCG CAT TAT  1073
Thr Asp Thr His Val Ala His His Leu Phe Ser Thr Met Pro His Tyr
            315                 320                 325

AAC GCA ATG GAA GCT ACA AAG GCG ATA AAG CCA ATT CTG GGA GAC TAT  1121
Asn Ala Met Glu Ala Thr Lys Ala Ile Lys Pro Ile Leu Gly Asp Tyr
        330                 335                 340

TAC CAG TTC GAT GGA ACA CCG TGG TAT GTA GCG ATG TAT AGG GAG GCA  1169
Tyr Gln Phe Asp Gly Thr Pro Trp Tyr Val Ala Met Tyr Arg Glu Ala
    345                 350                 355

AAG GAG TGT ATC TAT GTA GAA CCG GAC AGG GAA GGT GAC AAG AAA GGT  1217
Lys Glu Cys Ile Tyr Val Glu Pro Asp Arg Glu Gly Asp Lys Lys Gly
360                 365                 370                 375

GTG TAC TGG TAC AAC AAT AAG TTA TGAGCATGAT GGTGAAGAAA TTGTCGACCT  1271
Val Tyr Trp Tyr Asn Asn Lys Leu
                380

TTCTCTTGTC TGTTTGTCTT TTGTTAAAGA AGCTATGCTT CGTTTTAATA ATCTTATTGT  1331

CCATTTTGTT GTGTTATGAC ATTTTGGCTG CTCATTATGT T  1372
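
The CDS feature of SEQ ID NO:1 (positions 93..1244) can be sanity-checked with a few lines of code: the span should be a whole number of codons, and the first codons should translate to the start of the listed protein. The sketch below builds the standard genetic code table and translates only the first 21 coding nucleotides shown in the listing; the function names are illustrative and not part of any sequence-listing tool.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE[dna[i:i + 3].upper()]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def codons_in_feature(start, end):
    """Number of codons in an inclusive 1-based CDS location."""
    length = end - start + 1
    if length % 3:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


# First seven codons of the SEQ ID NO:1 coding region (positions 93 to 113).
cds_start = "ATGGGTGCAGGTGGAAGAATG"
n_codons = codons_in_feature(93, 1244)

print(translate(cds_start))   # one-letter codes for Met Gly Ala Gly Gly Arg Met
print(n_codons - 1, "residues plus a stop codon")
```

The location spans 384 codons, that is 383 residues plus the stop codon, which matches the 383 amino acid length given for SEQ ID NO:2.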

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 383 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Gly Ala Gly Gly Arg Met Pro Val Pro Thr Ser Ser Lys Lys Ser
1               5                   10                  15

Glu Thr Asp Thr Thr Lys Arg Val Pro Cys Glu Lys Pro Pro Phe Ser
            20                  25                  30

Val Gly Asp Leu Lys Lys Ala Ile Pro Pro His Cys Phe Lys Arg Ser
        35                  40                  45

Ile Pro Arg Ser Phe Ser Tyr Leu Ile Ser Asp Ile Ile Ile Ala Ser
    50                  55                  60

Cys Phe Tyr Tyr Val Ala Thr Asn Tyr Phe Ser Leu Leu Pro Gln Pro
65                  70                  75                  80

Leu Ser Tyr Leu Ala Trp Pro Leu Tyr Trp Ala Cys Gln Gly Cys Val
                85                  90                  95

Leu Thr Gly Ile Trp Val Ile Ala His Glu Cys Gly His His Ala Phe
            100                 105                 110

Ser Asp Tyr Gln Trp Leu Asp Asp Thr Val Gly Leu Ile Phe His Ser
        115                 120                 125

Phe Leu Leu Val Pro Tyr Phe Ser Trp Lys Tyr Ser His Arg Arg His
    130                 135                 140

His Ser Asn Thr Gly Ser Leu Glu Arg Asp Glu Val Phe Val Pro Lys
145                 150                 155                 160

Gln Lys Ser Ala Ile Lys Trp Tyr Gly Lys Tyr Leu Asn Asn Pro Leu
                165                 170                 175

Gly Arg Ile Met Met Leu Thr Val Gln Phe Val Leu Gly Trp Pro Leu
            180                 185                 190

Tyr Leu Ala Phe Asn Val Ser Gly Arg Pro Tyr Asp Gly Phe Ala Cys
        195                 200                 205

His Phe Phe Pro Asn Ala Pro Ile Tyr Asn Asp Arg Glu Arg Leu Gln
    210                 215                 220

Ile Tyr Leu Ser Asp Ala Gly Ile Leu Ala Val Cys Phe Gly Leu Tyr
225                 230                 235                 240

Arg Tyr Ala Ala Ala Gln Gly Met Ala Ser Met Ile Cys Leu Tyr Gly
                245                 250                 255

Val Pro Leu Leu Ile Val Asn Ala Phe Leu Val Leu Ile Thr Tyr Leu
            260                 265                 270

Gln His Thr His Pro Ser Leu Pro His Tyr Asp Ser Ser Glu Trp Asp
        275                 280                 285

Trp Leu Arg Gly Ala Leu Ala Thr Val Asp Arg Asp Tyr Gly Ile Leu
    290                 295                 300

Asn Lys Val Phe His Asn Ile Thr Asp Thr His Val Ala His His Leu
305                 310                 315                 320

Phe Ser Thr Met Pro His Tyr Asn Ala Met Glu Ala Thr Lys Ala Ile
                325                 330                 335

Lys Pro Ile Leu Gly Asp Tyr Tyr Gln Phe Asp Gly Thr Pro Trp Tyr
            340                 345                 350

Val Ala Met Tyr Arg Glu Ala Lys Glu Cys Ile Tyr Val Glu Pro Asp
        355                 360                 365

Arg Glu Gly Asp Lys Lys Gly Val Tyr Trp Tyr Asn Asn Lys Leu
    370                 375                 380

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 1394 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
    (A) ORGANISM: Brassica napus

(vii) IMMEDIATE SOURCE:
    (B) CLONE: pCF2-165D

(ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 99..1250

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

GAGAGGAGAC AGAGACAGAG AGAGAGTTGA GAGAGCTCTC GTAGGTTATC GTATTAACGT  60

AATCTTCAAT CCCCCCTACG TCAGCCAGCT CAAGAAAC ATG GGT GCA GGT GGA  113
                                          Met Gly Ala Gly Gly
                                          1               5

AGA ATG CAA GTG TCT CCT CCC TCC AAA AAG TCT GAA ACC GAC AAC ATC  161
Arg Met Gln Val Ser Pro Pro Ser Lys Lys Ser Glu Thr Asp Asn Ile
                10                  15                  20

AAG CGC GTA CCC TGC GAG ACA CCG CCC TTC ACT GTC GGA GAA CTC AAG  209
Lys Arg Val Pro Cys Glu Thr Pro Pro Phe Thr Val Gly Glu Leu Lys
            25                  30                  35

AAA GCA ATC CCA CCG CAC TGT TTC AAG CGC TCG ATC CCT CGC TCT TTC  257
Lys Ala Ile Pro Pro His Cys Phe Lys Arg Ser Ile Pro Arg Ser Phe
        40                  45                  50

TCC CAC CTC ATC TGG GAC ATC ATC ATA GCC TCC TGC TTC TAC TAC GTG  305
Ser His Leu Ile Trp Asp Ile Ile Ile Ala Ser Cys Phe Tyr Tyr Val
    55                  60                  65

GCC ACC ACT TAC TTC CCT CTC CTC CCT AAC CCT CTC TCC TAC TTC GCG  353
Ala Thr Thr Tyr Phe Pro Leu Leu Pro Asn Pro Leu Ser Tyr Phe Ala
70                  75                  80                  85

TGG CCT CTC TAC TGG GCC TGC CAG GGG TGC GTG CTA ACC GGG GTC TGG  401
Trp Pro Leu Tyr Trp Ala Cys Gln Gly Cys Val Leu Thr Gly Val Trp
                90                  95                  100

GTC ATA GCC CAC GAG TGC GGC CAC GCA GCC TTC AGC GAC TAC CAG TGG  449
Val Ile Ala His Glu Cys Gly His Ala Ala Phe Ser Asp Tyr Gln Trp
            105                 110                 115

CTG GAC GAC ACC GTC GGC CTC ATC TTC CAC TCC TTC CTC CTC GTC CCT  497
Leu Asp Asp Thr Val Gly Leu Ile Phe His Ser Phe Leu Leu Val Pro
        120                 125                 130

TAC TTC TCC TGG AAG TAC AGT CAT CGA CGG CAC CAT TCG AAC ACT GGC  545
Tyr Phe Ser Trp Lys Tyr Ser His Arg Arg His His Ser Asn Thr Gly
    135                 140                 145

TCC CTC GAG AGA GAC GAA GTG TTT GTG CCA AGA AGA AGT CAG ACA TCA  593
Ser Leu Glu Arg Asp Glu Val Phe Val Pro Arg Arg Ser Gln Thr Ser
150                 155                 160                 165

AGT GGT ACG GCA AGT ACC TCA ACA ACC TTT GGA CGC ACC GTG ATG TTA  641
Ser Gly Thr Ala Ser Thr Ser Thr Thr Phe Gly Arg Thr Val Met Leu
                170                 175                 180

ACG GTT CAG TTC ACT CTC GGG TGG CCT TTG TAC TTA GCG TTC AAC GTG  689
Thr Val Gln Phe Thr Leu Gly Trp Pro Leu Tyr Leu Ala Phe Asn Val
            185                 190                 195

TCG GGG AGA CCT TAC GAC GGC GGG TTC GCT TGC CAT TTC CAC CCC AAC  737
Ser Gly Arg Pro Tyr Asp Gly Gly Phe Ala Cys His Phe His Pro Asn
        200                 205                 210

GCT CCC ATC TAC AAC GAC CGT GAG CGT CTC CAG ATA TAC ATC TCC GAC  785
Ala Pro Ile Tyr Asn Asp Arg Glu Arg Leu Gln Ile Tyr Ile Ser Asp
    215                 220                 225

GCT GGG ATC CTC GCC GTG TGC TAC GGT CTG CTA CCG TAC GCT GCT GTG  833
Ala Gly Ile Leu Ala Val Cys Tyr Gly Leu Leu Pro Tyr Ala Ala Val
230                 235                 240                 245

CAA GGA GTT GCG TCG ATG GTC TGC TTC CTA CGA GTT CCT CTT CTG ATT  881
Gln Gly Val Ala Ser Met Val Cys Phe Leu Arg Val Pro Leu Leu Ile
                250                 255                 260

GTC AAC GGG TTC TTA GTT TTG ATC ACT TAC TTG CAG CAC ACG CAT CCT  929
Val Asn Gly Phe Leu Val Leu Ile Thr Tyr Leu Gln His Thr His Pro
            265                 270                 275

TCC CTG CCT CAC TAT GAC TCG TCT GAG TGG GAT TGG TTG AGG GGA GCT  977
Ser Leu Pro His Tyr Asp Ser Ser Glu Trp Asp Trp Leu Arg Gly Ala
        280                 285                 290

TTG GCC ACC GTT GAC AGA GAC TAC GGA ATC TTG AAC CAA GGC TTC CAC  1025
Leu Ala Thr Val Asp Arg Asp Tyr Gly Ile Leu Asn Gln Gly Phe His
    295                 300                 305

AAT ATC ACG GAC ACG CAC GAG GCG CAT CAC CTG TTC TCG ACC ATG CCG  1073
Asn Ile Thr Asp Thr His Glu Ala His His Leu Phe Ser Thr Met Pro
310                 315                 320                 325

CAT TAT CAT GCG ATG GAA GCT ACG AAG GCG ATA AAG CCG ATA CTG GGA  1121
His Tyr His Ala Met Glu Ala Thr Lys Ala Ile Lys Pro Ile Leu Gly
                330                 335                 340

GAG TAT TAT CAG TTC GAT GGG ACG CCG GTG GTT AAG GCG ATG TGG AGG  1169
Glu Tyr Tyr Gln Phe Asp Gly Thr Pro Val Val Lys Ala Met Trp Arg
            345                 350                 355

GAG GCG AAG GAG TGT ATC TAT GTG GAA CCG GAC AGG CAA GGT GAG AAG  1217
Glu Ala Lys Glu Cys Ile Tyr Val Glu Pro Asp Arg Gln Gly Glu Lys
        360                 365                 370

AAA GGT GTG TTC TGG TAC AAC AAT AAG TTA TGAAGCAAAG AAGAAACTGA  1267
Lys Gly Val Phe Trp Tyr Asn Asn Lys Leu
    375                 380

ACCTTTCTCT TCTATCAATT GTCTTTGTTT AAGAAGCTAT GTTTCTGTTT CAATAATCTT  1327

AATTATCCAT TTTGTTGTGT TTTCTGACAT TTTGGCTAAA ATTATGTGAT GTTGGAAGTT  1387

AGTGTCT  1394



(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 383 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Gly Ala Gly Gly Arg Met Gln Val Ser Pro Pro Ser Lys Lys Ser
1               5                   10                  15

Glu Thr Asp Asn Ile Lys Arg Val Pro Cys Glu Thr Pro Pro Phe Thr
            20                  25                  30

Val Gly Glu Leu Lys Lys Ala Ile Pro Pro His Cys Phe Lys Arg Ser
        35                  40                  45

Ile Pro Arg Ser Phe Ser His Leu Ile Trp Asp Ile Ile Ile Ala Ser
    50                  55                  60

Cys Phe Tyr Tyr Val Ala Thr Thr Tyr Phe Pro Leu Leu Pro Asn Pro
65                  70                  75                  80

Leu Ser Tyr Phe Ala Trp Pro Leu Tyr Trp Ala Cys Gln Gly Cys Val
                85                  90                  95

Leu Thr Gly Val Trp Val Ile Ala His Glu Cys Gly His Ala Ala Phe
            100                 105                 110

Ser Asp Tyr Gln Trp Leu Asp Asp Thr Val Gly Leu Ile Phe His Ser
        115                 120                 125

Phe Leu Leu Val Pro Tyr Phe Ser Trp Lys Tyr Ser His Arg Arg His
    130                 135                 140

His Ser Asn Thr Gly Ser Leu Glu Arg Asp Glu Val Phe Val Pro Arg
145                 150                 155                 160

Arg Ser Gln Thr Ser Ser Gly Thr Ala Ser Thr Ser Thr Thr Phe Gly
                165                 170                 175

Arg Thr Val Met Leu Thr Val Gln Phe Thr Leu Gly Trp Pro Leu Tyr
            180                 185                 190

Leu Ala Phe Asn Val Ser Gly Arg Pro Tyr Asp Gly Gly Phe Ala Cys
        195                 200                 205

His Phe His Pro Asn Ala Pro Ile Tyr Asn Asp Arg Glu Arg Leu Gln
    210                 215                 220

Ile Tyr Ile Ser Asp Ala Gly Ile Leu Ala Val Cys Tyr Gly Leu Leu
225                 230                 235                 240

Pro Tyr Ala Ala Val Gln Gly Val Ala Ser Met Val Cys Phe Leu Arg
                245                 250                 255

Val Pro Leu Leu Ile Val Asn Gly Phe Leu Val Leu Ile Thr Tyr Leu
            260                 265                 270

Gln His Thr His Pro Ser Leu Pro His Tyr Asp Ser Ser Glu Trp Asp
        275                 280                 285

Trp Leu Arg Gly Ala Leu Ala Thr Val Asp Arg Asp Tyr Gly Ile Leu
    290                 295                 300

Asn Gln Gly Phe His Asn Ile Thr Asp Thr His Glu Ala His His Leu
305                 310                 315                 320

Phe Ser Thr Met Pro His Tyr His Ala Met Glu Ala Thr Lys Ala Ile
                325                 330                 335

Lys Pro Ile Leu Gly Glu Tyr Tyr Gln Phe Asp Gly Thr Pro Val Val
            340                 345                 350

Lys Ala Met Trp Arg Glu Ala Lys Glu Cys Ile Tyr Val Glu Pro Asp
        355                 360                 365

Arg Gln Gly Glu Lys Lys Gly Val Phe Trp Tyr Asn Asn Lys Leu
    370                 375                 380



(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 1462 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
    (A) ORGANISM: Glycine max

(vii) IMMEDIATE SOURCE:
    (B) CLONE: pSF2-165K

(ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 108..1247

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

CCATATACTA ATATTTGCTT GTATTGATAG CCCCTCCGTT CCCAAGAGTA TAAAACTGCA  60

TCGAATAATA CAAGCCACTA GGCATGQGTC TAGCAAAGGA AACAACA ATG GGA GGT  116
                                                    Met Gly Gly
                                                    1

AGA GGT CGT GTG GCC AAA GTG GAA GTT CAA GGG AAG AAG CCT CTC TCA  164
Arg Gly Arg Val Ala Lys Val Glu Val Gln Gly Lys Lys Pro Leu Ser
    5                   10                  15

AGG GTT CCA AAC ACA AAG CCA CCA TTC ACT GTT GGC CAA CTC AAG AAA  212
Arg Val Pro Asn Thr Lys Pro Pro Phe Thr Val Gly Gln Leu Lys Lys
20                  25                  30                  35

GCA ATT CCA CCA CAC TGC TTT CAG CGC TCC CTC CTC ACT TCA TTC TCC  260
Ala Ile Pro Pro His Cys Phe Gln Arg Ser Leu Leu Thr Ser Phe Ser
                40                  45                  50

TAT GTT GTT TAT GAC CTT TCA TTT GCC TTC ATT TTC TAC ATT GCC ACC  308
Tyr Val Val Tyr Asp Leu Ser Phe Ala Phe Ile Phe Tyr Ile Ala Thr
            55                  60                  65

ACC TAC TTC CAC CTC CTT CCT CAA CCC TTT TCC CTC ATT GCA TGG CCA  356
Thr Tyr Phe His Leu Leu Pro Gln Pro Phe Ser Leu Ile Ala Trp Pro
        70                  75                  80

ATC TAT TGG GTT CTC CAA GGT TGC CTT CTC ACT GGT GTG TGG GTG ATT  404
Ile Tyr Trp Val Leu Gln Gly Cys Leu Leu Thr Gly Val Trp Val Ile
    85                  90                  95

GCT CAC GAG TGT GGT CAC CAT GCC TTC AGC AAG TAC CAA TGG GTT GAT  452
Ala His Glu Cys Gly His His Ala Phe Ser Lys Tyr Gln Trp Val Asp
100                 105                 110                 115

GAT GTT GTG GGT TTG ACC CTT CAC TCA ACA CTT TTA GTC CCT TAT TTC  500
Asp Val Val Gly Leu Thr Leu His Ser Thr Leu Leu Val Pro Tyr Phe
                120                 125                 130

TCA TGG AAA ATA AGC CAT CGC CGC CAT CAC TCC AAC ACA GGT TCC CTT  548
Ser Trp Lys Ile Ser His Arg Arg His His Ser Asn Thr Gly Ser Leu
            135                 140                 145

GAC CGT GAT GAA GTG TTT GTC CCA AAA CCA AAA TCC AAA GTT GCA TGG  596
Asp Arg Asp Glu Val Phe Val Pro Lys Pro Lys Ser Lys Val Ala Trp
        150                 155                 160

TTT TCC AAG TAC TTA AAC AAC CCT CTA GGA AGG GCT GTT TCT CTT CTC  644
Phe Ser Lys Tyr Leu Asn Asn Pro Leu Gly Arg Ala Val Ser Leu Leu
    165                 170                 175

GTC ACA CTC ACA ATA GGG TGG CCT ATG TAT TTA GCC TTC AAT GTC TCT  692
Val Thr Leu Thr Ile Gly Trp Pro Met Tyr Leu Ala Phe Asn Val Ser
180                 185                 190                 195

GGT AGA CCC TAT GAT AGT TTT GCA AGC CAC TAC CAC CCT TAT GCT CCC  740
Gly Arg Pro Tyr Asp Ser Phe Ala Ser His Tyr His Pro Tyr Ala Pro
                200                 205                 210

ATA TAT TCT AAC CGT GAG AGG CTT CTG ATC TAT GTC TCT GAT GTT GCT  788
Ile Tyr Ser Asn Arg Glu Arg Leu Leu Ile Tyr Val Ser Asp Val Ala
            215                 220                 225

TTG TTT TCT GTG ACT TAC TCT CTC TAC CGT GTT GCA ACC CTG AAA GGG  836
Leu Phe Ser Val Thr Tyr Ser Leu Tyr Arg Val Ala Thr Leu Lys Gly
        230                 235                 240

TTG GTT TGG CTG CTA TGT GTT TAT GGG GTG CCT TTG CTC ATT GTG AAC  884
Leu Val Trp Leu Leu Cys Val Tyr Gly Val Pro Leu Leu Ile Val Asn
    245                 250                 255

GGT TTT CTT GTG ACT ATC ACA TAT TTG CAG CAC ACA CAC TTT GCC TTG  932
Gly Phe Leu Val Thr Ile Thr Tyr Leu Gln His Thr His Phe Ala Leu
260                 265                 270                 275

CCT CAT TAC GAT TCA TCA GAA TGG GAC TGG CTG AAG GGA GCT TTG GCA  980
Pro His Tyr Asp Ser Ser Glu Trp Asp Trp Leu Lys Gly Ala Leu Ala
                280                 285                 290

ACT ATG GAC AGA GAT TAT GGG ATT CTG AAC AAG GTG TTT CAT CAC ATA  1028
Thr Met Asp Arg Asp Tyr Gly Ile Leu Asn Lys Val Phe His His Ile
            295                 300                 305

ACT GAT ACT CAT GTG GCT CAC CAT CTC TTC TCT ACA ATG CCA CAT TAC  1076
Thr Asp Thr His Val Ala His His Leu Phe Ser Thr Met Pro His Tyr
        310                 315                 320

CAT GCA ATG GAG GCA ACC AAT GCA ATC AAG CCA ATA TTG GGT GAG TAC  1124
His Ala Met Glu Ala Thr Asn Ala Ile Lys Pro Ile Leu Gly Glu Tyr
    325                 330                 335

TAC CAA TTT GAT GAC ACA CCA TTT TAC AAG GCA CTG TGG AGA GAA GCG  1172
Tyr Gln Phe Asp Asp Thr Pro Phe Tyr Lys Ala Leu Trp Arg Glu Ala
340                 345                 350                 355

AGA GAG TGC CTC TAT GTG GAG CCA GAT GAA GGA ACA TCC GAG AAG GGC  1220
Arg Glu Cys Leu Tyr Val Glu Pro Asp Glu Gly Thr Ser Glu Lys Gly
                360                 365                 370

GTG TAT TGG TAC AGG AAC AAG TAT TGATGGAGCA ACCAATGGGC CATAGTGGGA  1274
Val Tyr Trp Tyr Arg Asn Lys Tyr
            375

GTTATGGAAG TTTTGTCATG TATTAGTACA TAATTAGTAG AATGTTATAA ATAAGTGGAT  1334

TTGCCGCGTA ATGACTTTGT GTGTATTGTG AAACAGCTTG TTGCGATCAT GGTTATAATG  1394

TAAAAATAAT TCTGGTATTA ATTACATGTG GAAAGTGTTC TGCTTATAGC TTTCTGCCTA  1454

AAAAAAAA  1462

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 379 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Met Gly Gly Arg Gly Arg Val Ala Lys Val Glu Val Gln Gly Lys Lys
1               5                   10                  15

Pro Leu Ser Arg Val Pro Asn Thr Lys Pro Pro Phe Thr Val Gly Gln
            20                  25                  30

Leu Lys Lys Ala Ile Pro Pro His Cys Phe Gln Arg Ser Leu Leu Thr
        35                  40                  45

Ser Phe Ser Tyr Val Val Tyr Asp Leu Ser Phe Ala Phe Ile Phe Tyr
    50                  55                  60

Ile Ala Thr Thr Tyr Phe His Leu Leu Pro Gln Pro Phe Ser Leu Ile
65                  70                  75                  80

Ala Trp Pro Ile Tyr Trp Val Leu Gln Gly Cys Leu Leu Thr Gly Val
                85                  90                  95

Trp Val Ile Ala His Glu Cys Gly His His Ala Phe Ser Lys Tyr Gln
            100                 105                 110

Trp Val Asp Asp Val Val Gly Leu Thr Leu His Ser Thr Leu Leu Val
        115                 120                 125

Pro Tyr Phe Ser Trp Lys Ile Ser His Arg Arg His His Ser Asn Thr
    130                 135                 140

Gly Ser Leu Asp Arg Asp Glu Val Phe Val Pro Lys Pro Lys Ser Lys
145                 150                 155                 160

Val Ala Trp Phe Ser Lys Tyr Leu Asn Asn Pro Leu Gly Arg Ala Val
                165                 170                 175

Ser Leu Leu Val Thr Leu Thr Ile Gly Trp Pro Met Tyr Leu Ala Phe
            180                 185                 190

Asn Val Ser Gly Arg Pro Tyr Asp Ser Phe Ala Ser His Tyr His Pro
        195                 200                 205

Tyr Ala Pro Ile Tyr Ser Asn Arg Glu Arg Leu Leu Ile Tyr Val Ser
    210                 215                 220

Asp Val Ala Leu Phe Ser Val Thr Tyr Ser Leu Tyr Arg Val Ala Thr
225                 230                 235                 240

Leu Lys Gly Leu Val Trp Leu Leu Cys Val Tyr Gly Val Pro Leu Leu
                245                 250                 255

Ile Val Asn Gly Phe Leu Val Thr Ile Thr Tyr Leu Gln His Thr His
            260                 265                 270

Phe Ala Leu Pro His Tyr Asp Ser Ser Glu Trp Asp Trp Leu Lys Gly
        275                 280                 285

Ala Leu Ala Thr Met Asp Arg Asp Tyr Gly Ile Leu Asn Lys Val Phe
    290                 295                 300

His His Ile Thr Asp Thr His Val Ala His His Leu Phe Ser Thr Met
305                 310                 315                 320

Pro His Tyr His Ala Met Glu Ala Thr Asn Ala Ile Lys Pro Ile Leu
                325                 330                 335

Gly Glu Tyr Tyr Gln Phe Asp Asp Thr Pro Phe Tyr Lys Ala Leu Trp
            340                 345                 350

Arg Glu Ala Arg Glu Cys Leu Tyr Val Glu Pro Asp Glu Gly Thr Ser
        355                 360                 365

Glu Lys Gly Val Tyr Trp Tyr Arg Asn Lys Tyr
    370                 375

(2) INFORMATION FOR SEQ ID NO: 7; 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1790 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Zea mays 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: pFad2#l 
(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 165.. 1328 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

CGGCCTCTCC CCTCCCTCCT CCCTGCAAAT CCTGCAGACA CCACCGCTCG TTTTTCTCTC 60 

CGGGACAGGA GAAAAGGGGA GAGAGAGGTG AGGCGCGGTG TCCGCCCGAT CTGCTCTGCC 120 

CCGACGCAGC TGTTACGACC TCCTCAGTCT CAGTCAGGAG CAAG ATG GGT GCC GGC 176 

Met Gly Ala Gly 
1 

GGC AGG ATG ACC GAG AAG GAG CGG GAG AAG CAG GAG CAG CTC GCC CGA 224 
Gly Arg Met Thr Glu Lys Glu Arg Glu Lys Gin Glu Gin Leu Ala Arg 
5 10 15 20 

GCT ACC GGT GGC GCC GCG ATG CAG CGG TCG CCG GTG GAG AAG CCT CCG 272
Ala Thr Gly Gly Ala Ala Met Gln Arg Ser Pro Val Glu Lys Pro Pro
25 30 35 
TTC ACT CTG GGT CAG ATC AAG AAG GCC ATC CCG CCA CAC TGC TTC GAG 320
Phe Thr Leu Gly Gln Ile Lys Lys Ala Ile Pro Pro His Cys Phe Glu
40 45 50 

CGC TCG GTG CTC AAG TCC TTC TCG TAG GTG GTC CAC GAC CTG GTG ATC 368 
Arg Ser Val Leu Lys Ser Phe Ser Tyr Val Val His Asp Leu Val lie 
55 60 65 

GCC GCG GCG CTC CTC TAC TTC GCG CTG GCC ATC ATA CCG GCG CTC CCA 416
Ala Ala Ala Leu Leu Tyr Phe Ala Leu Ala Ile Ile Pro Ala Leu Pro
70 75 80 

AGC CCG CTC CGC TAC GCC GCC TGG CCG CTG TAC TGG ATC GCG CAG GGG 464 
Ser Pro Leu Arg Tyr Ala Ala Trp Pro Leu Tyr Trp He Ala Gin Gly 
85 90 95 100 

TGC GTG TGC ACC GGC GTG TGG GTC ATC GCG CAC GAG TGC GGC CAC CAC 512 
Cys Val Cys Thr Gly Val Trp Val He Ala His Glu Cys Gly His His 
105 110 115 

GCC TTC TCG GAC TAC TCG CTC CTG GAC GAC GTG GTC GGC CTG GTG CTG 560 
Ala Phe Ser Asp Tyr Ser Leu Leu Asp Asp Val Val Gly Leu Val Leu 
120 125 130 

CAC TCG TCG CTC ATG GTG CCC TAC TTC TCG TGG AAG TAC AGC CAC CGG 608
His Ser Ser Leu Met Val Pro Tyr Phe Ser Trp Lys Tyr Ser His Arg 
135 140 145 

CGC CAC CAC TCC AAC ACG GGG TCC CTG GAG CGC GAC GAG GTG TTC GTG 656 
Arg His His Ser Asn Thr Gly Ser Leu Glu Arg Asp Glu Val Phe Val 
150 155 160

CCC AAG AAG AAG GAG GCG CTG CCG TGG TAC ACC CCG TAC GTG TAC AAC 704 
Pro Lys Lys Lys Glu Ala Leu Pro Trp Tyr Thr Pro Tyr Val Tyr Asn 
165 170 175 180 

AAC CCG GTC GGC CGG GTG GTG CAC ATC GTG GTG CAG CTC ACC CTC GGG 752 
Asn Pro Val Gly Arg Val Val His He Val Val Gin Leu Thr Leu Gly 
185 190 195 

TGG CCG CTG TAC CTG GCG ACC AAC GCG TCG GGG CGG CCG TAC CCG CGC 800 
Trp Pro Leu Tyr Leu Ala Thr Asn Ala Ser Gly Arg Pro Tyr Pro Arg 
200 205 210 

TTC GCC TGC CAC TTC GAC CCC TAC GGC CCC ATC TAC AAC GAC CGG GAG 848 
Phe Ala Cys His Phe Asp Pro Tyr Gly Pro He Tyr Asn Asp Arg Glu 
215 220 225 

CGC GCC CAG ATC TTC GTC TCG GAC GCC GGC GTC GTG GCC GTG GCG TTC 896 
Arg Ala Gin He Phe Val Ser Asp Ala Gly Val Val Ala Val Ala Phe 
230 235 240 

GGG CTG TAC AAG CTG GCG GCG GCG TTC GGG GTC TGG TGG GTG GTG CGC 944 
Gly Leu Tyr Lys Leu Ala Ala Ala Phe Gly Val Trp Trp Val Val Arg 
245 250 255 260
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GTG TAG GCC GTG CCG CTG CTG ATC GTG AAC GCG TGG CTG GTG CTC ATC 992 
Val Tyr Ala Val Pro Leu Leu lie Val Asn Ala Trp Leu Val Leu He 
265 270 275 

ACC TAG CTG GAG GAG AGG GAG GGG TGG GTG GGG GAG TAG GAG TGG AGC 1040 
Thr Tyr Leu Gin His Thr His Pro Ser Leu Pro His Tyr Asp Ser Ser 
280 285 290 

GAG TGG GAG TGG CTG CGC GGC GGG GTG GCG AGC ATG GAG CGC GAG TAG 1088 
Glu Trp Asp Trp Leu Arg Gly Ala Leu Ala Thr Met Asp Arg Asp Tyr 
295 300 305 

GGC ATC CTC AAC CGC GTG TTG GAG AAC ATC AGG GAG ACG CAC GTG GCG 1136 
Gly He Leu Asn Arg Val Phe His Asn He Thr Asp Thr His Val Ala 
310 315 320 

CAC CAC CTC TTC TGG ACC ATG CCG CAC TAG CAC GCC ATG GAG GCC ACC 1184 
His His Leu Phe Ser Thr Met Pro His Tyr His Ala Met Glu Ala Thr 
325 330 335 340 

AAG GGG ATC AGG CGC ATC CTC GGG GAG TAG TAC CAC TTC GAC CCG ACC 1232
Lys Ala Ile Arg Pro Ile Leu Gly Asp Tyr Tyr His Phe Asp Pro Thr
345 350 355 

CCT GTG GCG AAG GCG ACC TGG CGC GAG GCC GGG GAA TGG ATC TAC GTG 1280 
Pro Val Ala Lys Ala Thr Trp Arg Glu Ala Gly Glu Cys He Tyr Val 
360 365 370 

GAG CGC GAG GAC GGC AAG GGC GTG TTG TGG TAC AAC AAG AAG TTC TAGCGGGCGG 1335 
Glu Pro Glu Asp Arg Lys Gly Val Phe Trp Tyr Asn Lys Lys Phe 
375 380 385 



CGGTCGGAGA GCTGAGGAGG CTAGGGTAGG AATGGGAGCA GAAACCAGGA GGAGGAGACG 1395

GTAGTCGGCC CAAAGTCTGC GTCAACCTAT GTAATGGTTA GTCGTCAGTC TTTTAGACGG 1455

GAAGAGAGAT CATTTGGGGA GAGAGAGGAA GGCTTAGTGC AGTGCGATGG GTAGAGCTGC 1515

CATCAAGTAG AAGTAGGCAA ATTCGTGAAC TTAGTGTGTC CGATGTTGTT TTTCTTAGTC 1575

GTGGGGTGCT GTAGGCTTTC GGGCGGCGGT CGTTTGTGTG GTTGGCATGG GTGGGCATGC 1635

GTGTGCGTGC GTGGCCGCGC TTGTGGTGTG CGTCTGTCGT CGGGTTGGCG TCGTCTGTTG 1695

GTGGTGCCCG TGTGTTGTTG TAAAACAAGA AGATGTTTTC            GGCGGAATAA 1755

CAGATCGTCC GAACGAAAAA AAAAAAAAAA AAAAA 1790



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 387 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Gly Ala Gly Gly Arg Met Thr Glu Lys Glu Arg Glu Lys Gln Glu
1 5 10 15

Gln Leu Ala Arg Ala Thr Gly Gly Ala Ala Met Gln Arg Ser Pro Val
20 25 30

Glu Lys Pro Pro Phe Thr Leu Gly Gln Ile Lys Lys Ala Ile Pro Pro
35 40 45

His Cys Phe Glu Arg Ser Val Leu Lys Ser Phe Ser Tyr Val Val His
50 55 60

Asp Leu Val Ile Ala Ala Ala Leu Leu Tyr Phe Ala Leu Ala Ile Ile
65 70 75 80

Pro Ala Leu Pro Ser Pro Leu Arg Tyr Ala Ala Trp Pro Leu Tyr Trp
85 90 95

Ile Ala Gln Gly Cys Val Cys Thr Gly Val Trp Val Ile Ala His Glu
100 105 110

Cys Gly His His Ala Phe Ser Asp Tyr Ser Leu Leu Asp Asp Val Val
115 120 125

Gly Leu Val Leu His Ser Ser Leu Met Val Pro Tyr Phe Ser Trp Lys
130 135 140

Tyr Ser His Arg Arg His His Ser Asn Thr Gly Ser Leu Glu Arg Asp
145 150 155 160

Glu Val Phe Val Pro Lys Lys Lys Glu Ala Leu Pro Trp Tyr Thr Pro
165 170 175

Tyr Val Tyr Asn Asn Pro Val Gly Arg Val Val His Ile Val Val Gln
180 185 190

Leu Thr Leu Gly Trp Pro Leu Tyr Leu Ala Thr Asn Ala Ser Gly Arg
195 200 205

Pro Tyr Pro Arg Phe Ala Cys His Phe Asp Pro Tyr Gly Pro Ile Tyr
210 215 220

Asn Asp Arg Glu Arg Ala Gln Ile Phe Val Ser Asp Ala Gly Val Val
225 230 235 240

Ala Val Ala Phe Gly Leu Tyr Lys Leu Ala Ala Ala Phe Gly Val Trp
245 250 255

Trp Val Val Arg Val Tyr Ala Val Pro Leu Leu Ile Val Asn Ala Trp
260 265 270

Leu Val Leu Ile Thr Tyr Leu Gln His Thr His Pro Ser Leu Pro His
275 280 285

Tyr Asp Ser Ser Glu Trp Asp Trp Leu Arg Gly Ala Leu Ala Thr Met
290 295 300

Asp Arg Asp Tyr Gly Ile Leu Asn Arg Val Phe His Asn Ile Thr Asp
305 310 315 320

Thr His Val Ala His His Leu Phe Ser Thr Met Pro His Tyr His Ala
325 330 335

Met Glu Ala Thr Lys Ala Ile Arg Pro Ile Leu Gly Asp Tyr Tyr His
340 345 350

Phe Asp Pro Thr Pro Val Ala Lys Ala Thr Trp Arg Glu Ala Gly Glu
355 360 365

Cys Ile Tyr Val Glu Pro Glu Asp Arg Lys Gly Val Phe Trp Tyr Asn
370 375 380

Lys Lys Phe
385
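The feature table for SEQ ID NO:7 gives the coding sequence as positions 165..1328, and SEQ ID NO:8 is listed as 387 amino acids. The two figures can be cross-checked with simple arithmetic. The following is a minimal Python sketch, not part of the original listing; the function name `cds_summary` is hypothetical, and it assumes the CDS range is 1-based, inclusive, and ends with the stop codon.

```python
def cds_summary(start: int, end: int) -> tuple[int, int, int]:
    """Return (nucleotides, codons, amino acids) for a 1-based inclusive CDS range.

    Assumption: the range includes the terminating stop codon, so the
    encoded polypeptide is one residue shorter than the codon count.
    """
    length_nt = end - start + 1
    if length_nt % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    codons = length_nt // 3
    return length_nt, codons, codons - 1


# SEQ ID NO:7 (maize cDNA): CDS 165..1328
print(cds_summary(165, 1328))
# SEQ ID NO:11 (castor cDNA): CDS 184..1347
print(cds_summary(184, 1347))
```

Under these assumptions both ranges span 1164 nucleotides, i.e. 388 codons including the stop, which agrees with the 387-residue lengths stated for SEQ ID NOS:8 and 12.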



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 673 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Ricinus communis
(vii) IMMEDIATE SOURCE:

(B) CLONE: pRF2-1C
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..673

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

TGG GTG ATG GCG CAT GAT TGT GGG CAC CAT GCC TTC AGT GAC TAT CAA 48
Trp Val Met Ala His Asp Cys Gly His His Ala Phe Ser Asp Tyr Gln
1 5 10 15

TTG CTT GAT GAT GTA GTT GGT CTT ATC CTA CAC TCC TGT CTC CTT GTC 96 
Leu Leu Asp Asp Val Val Gly Leu Ile Leu His Ser Cys Leu Leu Val
20 25 30 

CCT TAT TTT TCA TGG AAA CAC AGC CAT CGC CGA CAT CAT TCC AAC ACA 144 
Pro Tyr Phe Ser Trp Lys His Ser His Arg Arg His His Ser Asn Thr 
35 40 45 

GGG TCC CTG GAA CGG GAT GAA GTG TTT GTT CCC AAG AAG AAA TCT AGT 192 
Gly Ser Leu Glu Arg Asp Glu Val Phe Val Pro Lys Lys Lys Ser Ser 
50 55 60

ATC CGT TGG TAT TCC AAA TAC CTC AAC AAC CCT CCA GGT CGT ATC ATG 240 
Ile Arg Trp Tyr Ser Lys Tyr Leu Asn Asn Pro Pro Gly Arg Ile Met
65 70 75 80

ACA ATT GCC GTC ACA CTT TCA CTT GGC TGG CCT CTG TAC CTA GCA TTC 288 
Thr Ile Ala Val Thr Leu Ser Leu Gly Trp Pro Leu Tyr Leu Ala Phe
85 90 95

AAT GTT TCA GGC AGG CCA TAT GAT CGG TTC GCC TGC CAC TAT GAC CCA 336 
Asn Val Ser Gly Arg Pro Tyr Asp Arg Phe Ala Cys His Tyr Asp Pro 
100 105 110

TAT GGC CCG ATC TAC AAT GAT CGC GAG CGA ATC GAG ATA TTC ATA TCA 384 
Tyr Gly Pro lie Tyr Asn Asp Arg Glu Arg lie Glu lie Phe lie Ser 
115 120 125 

GAT GCT GGT GTT CTT GCT GTC ACT TTT GGT CTC TAC CAA CTT GCT ATA 432 
Asp Ala Gly Val Leu Ala Val Thr Phe Gly Leu Tyr Gin Leu Ala He 
130 135 140 

GCG AAG GGG CTT GCT TGG GTT GTC TGT GTA TAT GGA GTG CCA TTG TTG 480 
Ala Lys Gly Leu Ala Trp Val Val Cys Val Tyr Gly Val Pro Leu Leu 
145 150 155 160 

GTG GTG AAT TCA TTC CTT GTT CTG ATC ACA TTT CTG CAG CAT ACT CAC 528 
Val Val Asn Ser Phe Leu Val Leu He Thr Phe Leu Gin His Thr His 
165 170 175 

CCT GCA TTG CCA CAT TAT GAT TCG TCG GAG TGG GAC TGG CTA AGA GGA 576
Pro Ala Leu Pro His Tyr Asp Ser Ser Glu Trp Asp Trp Leu Arg Gly 
180 185 190 

GCT CTA GCA ACT GTT GAC AGA GAT TAC GGG ATC TTG AAC AAG GTG TTC 624 
Ala Leu Ala Thr Val Asp Arg Asp Tyr Gly He Leu Asn Lys Val Phe 
195 200 205 

CAT AAC ATA ACG GAC ACT CAA GTA GCT CAC CAC CTT TTC ACC ATG CCC C 673 
His Asn He Thr Asp Thr Gin Val Ala His His Leu Phe Thr Met Pro 
210 215 220 
(2) INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 224 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Trp Val Met Ala His Asp Cys Gly His His Ala Phe Ser Asp Tyr Gin 
1 5 10 15 

Leu Leu Asp Asp Val Val Gly Leu lie Leu His Ser Cys Leu Leu Val 
20 25 30 

Pro Tyr Phe Ser Trp Lys His Ser His Arg Arg His His Ser Asn Thr 
35 40 45 

Gly Ser Leu Glu Arg Asp Glu Val Phe Val Pro Lys Lys Lys Ser Ser 
50 55 60 

lie Arg Trp Tyr Ser Lys Tyr Leu Asn Asn Pro Pro Gly Arg lie Met 
65 70 75 80 

Thr lie Ala Val Thr Leu Ser Leu Gly Trp Pro Leu Tyr Leu Ala Phe 
85 90 95

Asn Val Ser Gly Arg Pro Tyr Asp Arg Phe Ala Cys His Tyr Asp Pro 
100 105 110 

Tyr Gly Pro He Tyr Asn Asp Arg Glu Arg He Glu He Phe He Ser 
115 120 125 

Asp Ala Gly Val Leu Ala Val Thr Phe Gly Leu Tyr Gin Leu Ala He 
130 135 140 

Ala Lys Gly Leu Ala Trp Val Val Cys Val Tyr Gly Val Pro Leu Leu 
145 150 155 160 

Val Val Asn Ser Phe Leu Val Leu He Thr Phe Leu Gin His Thr His 
165 170 175 

Pro Ala Leu Pro His Tyr Asp Ser Ser Glu Trp Asp Trp Leu Arg Gly 
180 185 190 

Ala Leu Ala Thr Val Asp Arg Asp Tyr Gly Ile Leu Asn Lys Val Phe
195 200 205

His Asn He Thr Asp Thr Gin Val Ala His His Leu Phe Thr Met Pro 
210 215 220 
(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1369 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Ricinus communis
(vii) IMMEDIATE SOURCE:

(B) CLONE: pRF197c-42
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 184..1347

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:



CGGCCGGGAT TCCGGTTTTC ACACTAATTT GCAAAAAATG CATGATTTCA CCTCAAATCA 60 

AACACCACAC CTTATAACTT AGTCTTAAGA GAGAGAGAGA GAGGAGACAT TTCTCTTCTC 120

TGAGATGAGC ACTTCTCTTC CAGACATCGA AGCCTCAGGA AAGTGCTTGA GAAGAGCTTG 180 

AGA ATG GGA GGT GGT GGT CGC ATG TCT ACT GTC ATA ATC AGC AAC AAC 228
Met Gly Gly Gly Gly Arg Met Ser Thr Val Ile Ile Ser Asn Asn
1 5 10 15

AGT GAG AAG AAA GGA GGA AGC AGC CAC CTG GAG CGA GCG CCG CAC ACG 276 
Ser Glu Lys Lys Gly Gly Ser Ser His Leu Glu Arg Ala Pro His Thr 
20 25 30 

AAG CCT CCT TAC ACA CTT GGT AAC CTC AAG AGA GCC ATC CCA CCC CAT 324 
Lys Pro Pro Tyr Thr Leu Gly Asn Leu Lys Arg Ala He Pro Pro His 
35 40 45 

TGC TTT GAA CGC TCT TTT GTG CGC TCA TTC TCC AAT TTT GCC TAT AAT 372 
Cys Phe Glu Arg Ser Phe Val Arg Ser Phe Ser Asn Phe Ala Tyr Asn 
50 55 60 
TTC TGC TTA AGT TTT CTT TCC TAC TCG ATC GCC ACC AAC TTC TTC CCT 420
Phe Cys Leu Ser Phe Leu Ser Tyr Ser Ile Ala Thr Asn Phe Phe Pro
65 70 75 

TAC ATC TCT TCT CCG CTC TCG TAT GTC GCT TGG CTG GTT TAC TGG CTC 468 
Tyr Ile Ser Ser Pro Leu Ser Tyr Val Ala Trp Leu Val Tyr Trp Leu
80 85 90 95 

TTC CAA GGC TGC ATT CTC ACT GGT CTT TGG GTC ATC GGC CAT GAA TGT 516 
Phe Gin Gly Cys lie Leu Thr Gly Leu Trp Val lie Gly His Glu Cys 
100 105 110 

GGC CAT CAT GCT TTT AGT GAG TAT CAG CTG GCT GAT GAC ATT GTT GGC 564 
Gly His His Ala Phe Ser Glu Tyr Gin Leu Ala Asp Asp He Val Gly 
115 120 125 

CTA ATT GTC CAT TCT GCA CTT CTG GTT CCA TAT TTT TCA TGG AAA TAT 612
Leu He Val His Ser Ala Leu Leu Val Pro Tyr Phe Ser Trp Lys Tyr 
130 135 140 

AGC CAT CGC CGC CAC CAT TCT AAC ATA GGA TCT CTC GAG CGA GAC GAA 660 
Ser His Arg Arg His His Ser Asn He Gly Ser Leu Glu Arg Asp Glu 
145 150 155 

GTG TTC GTC CCG AAA TCA AAG TCG AAA ATT TCA TGG TAT TCT AAG TAC 708 
Val Phe Val Pro Lys Ser Lys Ser Lys He Ser Trp Tyr Ser Lys Tyr 
160 165 170 175 

TTA AAC AAC CCG CCA GGT CGA GTT TTG ACA CTT GCT GCC ACG CTC CTC 756 
Leu Asn Asn Pro Pro Gly Arg Val Leu Thr Leu Ala Ala Thr Leu Leu 
180 185 190

CTT GGC TGG CCT TTA TAT TTA GCT TTC AAT GTC TCT GGT AGA CCT TAG 804 
Leu Gly Trp Pro Leu Tyr Leu Ala Phe Asn Val Ser Gly Arg Pro Tyr
195 200 205 

GAT CGC TTT GCT TGC CAT TAT GAT CCC TAT GGC CCA ATA TTT TCC GAA 852 
Asp Arg Phe Ala Cys His Tyr Asp Pro Tyr Gly Pro He Phe Ser Glu 
210 215 220 

AGA GAA AGG CTT CAG ATT TAC ATT GCT GAG CTC GGA ATC TTT GCC ACA 900 
Arg Glu Arg Leu Gin He Tyr He Ala Asp Leu Gly He Phe Ala Thr 
225 230 235 

ACG TTT GTG CTT TAT CAG GCT ACA ATG GCA AAA GGG TTG GCT TGG GTA 948
Thr Phe Val Leu Tyr Gin Ala Thr Met Ala Lys Gly Leu Ala Trp Val 
240 245 250 255 

ATG CGT ATC TAT GGG GTG CCA TTG CTT ATT GTT AAC TGT TTC CTT GTT 996 
Met Arg He Tyr Gly Val Pro Leu Leu He Val Asn Cys Phe Leu Val 
260 265 270 

ATG ATC ACA TAC TTG CAG CAC ACT CAC CCA GCT ATT CCA CGC TAT GGC 1044 
Met He Thr Tyr Leu Gin His Thr His Pro Ala He Pro Arg Tyr Gly 
275 280 285 
TCA TCG GAA TGG GAT TGG CTC CGG GGA GCA ATG GTG ACT GTC GAT AGA 1092
Ser Ser Glu Trp Asp Trp Leu Arg Gly Ala Met Val Thr Val Asp Arg
290 295 300

GAT TAT GGG GTG TTG AAT AAA GTA TTC CAT AAC ATT GCA GAC ACT CAT 1140 
Asp Tyr Gly Val Leu Asn Lys Val Phe His Asn lie Ala Asp Thr His 
305 310 315 

GTA GCT CAT CAT CTC TTT GCT ACA GTG CCA CAT TAC CAT GCA ATG GAG 1188 
Val Ala His His Leu Phe Ala Thr Val Pro His Tyr His Ala Met Glu 
320 325 330 335 

GCC ACT AAA GCA ATC AAG CCT ATA ATG GGT GAG TAT TAC CGG TAT GAT 1236 
Ala Thr Lys Ala lie Lys Pro lie Met Gly Glu Tyr Tyr Arg Tyr Asp 
340 345 350 

GGT ACC CCA TTT TAC AAG GCA TTG TGG AGG GAG GCA AAG GAG TGC TTG 1284 
Gly Thr Pro Phe Tyr Lys Ala Leu Trp Arg Glu Ala Lys Glu Cys Leu 
355 360 365

TTC GTC GAG CCA GAT GAA GGA GCT CCT ACA CAA GGC GTT TTC TGG TAC 1332 
Phe Val Glu Pro Asp Glu Gly Ala Pro Thr Gin Gly Val Phe Trp Tyr 
370 375 380 

CGG AAC AAG TAT TAAAAAAGTG TCATGTAGCC TGCCG 1369 
Arg Asn Lys Tyr 
385 

(2) INFORMATION FOR SEQ ID NO: 12: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 387 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Met Gly Gly Gly Gly Arg Met Ser Thr Val Ile Ile Ser Asn Asn Ser
1 5 10 15 

Glu Lys Lys Gly Gly Ser Ser His Leu Glu Arg Ala Pro His Thr Lys 
20 25 30 

Pro Pro Tyr Thr Leu Gly Asn Leu Lys Arg Ala He Pro Pro His Cys 
35 40 45 

Phe Glu Arg Ser Phe Val Arg Ser Phe Ser Asn Phe Ala Tyr Asn Phe
50 55 60 

Cys Leu Ser Phe Leu Ser Tyr Ser He Ala Thr Asn Phe Phe Pro Tyr 
65 70 75 80 
Ile Ser Ser Pro Leu Ser Tyr Val Ala Trp Leu Val Tyr Trp Leu Phe
85 90 95

Gin Gly Cys lie Leu Thr Gly Leu Trp Val lie Gly His Glu Cys Gly 
100 105 110 

His His Ala Phe Ser Glu Tyr Gin Leu Ala Asp Asp He Val Gly Leu 
115 120 125 

Ile Val His Ser Ala Leu Leu Val Pro Tyr Phe Ser Trp Lys Tyr Ser
130 135 140 

His Arg Arg His His Ser Asn He Gly Ser Leu Glu Arg Asp Glu Val 
145 150 155 160 

Phe Val Pro Lys Ser Lys Ser Lys He Ser Trp Tyr Ser Lys Tyr Leu 
165 170 175 

Asn Asn Pro Pro Gly Arg Val Leu Thr Leu Ala Ala Thr Leu Leu Leu
180 185 190 

Gly Trp Pro Leu Tyr Leu Ala Phe Asn Val Ser Gly Arg Pro Tyr Asp 
195 200 205 

Arg Phe Ala Cys His Tyr Asp Pro Tyr Gly Pro He Phe Ser Glu Arg 
210 215 220 

Glu Arg Leu Gin He Tyr He Ala Asp Leu Gly He Phe Ala Thr Thr 
225 230 235 240 

Phe Val Leu Tyr Gin Ala Thr Met Ala Lys Gly Leu Ala Trp Val Met 
245 250 255 

Arg He Tyr Gly Val Pro Leu Leu He Val Asn Cys Phe Leu Val Met 
260 265 270 

He Thr Tyr Leu Gin His Thr His Pro Ala He Pro Arg Tyr Gly Ser 
275 280 285 

Ser Glu Trp Asp Trp Leu Arg Gly Ala Met Val Thr Val Asp Arg Asp 
290 295 300 

Tyr Gly Val Leu Asn Lys Val Phe His Asn He Ala Asp Thr His Val 
305 310 315 320 

Ala His His Leu Phe Ala Thr Val Pro His Tyr His Ala Met Glu Ala 
325 330 335 

Thr Lys Ala He Lys Pro lie Met Gly Glu Tyr Tyr Arg Tyr Asp Gly 
340 345 350 

Thr Pro Phe Tyr Lys Ala Leu Trp Arg Glu Ala Lys Glu Cys Leu Phe 
355 360 365 

Val Glu Pro Asp Glu Gly Ala Pro Thr Gin Gly Val Phe Trp Tyr Arg 
370 375 380 
Asn Lys Tyr
385 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 1..23 

(D) OTHER INFORMATION: /product="synthetic oligonucleotide"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

TGGGTATGCC AYGANTGYGG NCA 23 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(ix) FEATURE: 

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..22 
(D) OTHER INFORMATION: /product="synthetic oligonucleotide"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

AAARTGRTGG CACRTGNGTR TC 22 
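The synthetic oligonucleotides of SEQ ID NOS:13 and 14 are written with IUPAC ambiguity codes (Y, R, N), so each entry stands for a pool of related sequences rather than a single molecule. The following is a minimal Python sketch, not part of the original listing, that counts and enumerates the members of such a pool. The function names `degeneracy` and `expand` are hypothetical; the code table is the standard IUPAC nucleotide alphabet.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def _clean(oligo: str) -> str:
    """Drop the spacing used in the listing and normalise case."""
    return oligo.replace(" ", "").upper()


def degeneracy(oligo: str) -> int:
    """Number of distinct sequences represented by a degenerate oligo."""
    count = 1
    for base in _clean(oligo):
        count *= len(IUPAC[base])
    return count


def expand(oligo: str):
    """Yield every concrete sequence in the degenerate pool."""
    pools = [IUPAC[base] for base in _clean(oligo)]
    for combo in product(*pools):
        yield "".join(combo)


seq13 = "TGGGTATGCC AYGANTGYGG NCA"  # SEQ ID NO:13 as listed
seq14 = "AAARTGRTGG CACRTGNGTR TC"   # SEQ ID NO:14 as listed

for name, oligo in (("SEQ ID NO:13", seq13), ("SEQ ID NO:14", seq14)):
    print(name, len(_clean(oligo)), "nt, degeneracy", degeneracy(oligo))
```

Counting before expanding is the safer order, since the pool size grows multiplicatively with each ambiguous position.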

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2973 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Arabidopsis thaliana
(vii) IMMEDIATE SOURCE:

(B) CLONE: PAGF2-6
(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 433..520

(ix) FEATURE:

(A) NAME/KEY: intron

(B) LOCATION: 521..1654

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

ATTCGGTAAT TCCTACATAT TTTAGAGATT AGTTTGAGTT TCCATCCATA CTTTACTAGT 60

GATTATAAAT TTAAAATACG TACTTTTCGA CTATAAAGTG AAACTAAGTA AATTAGAACG 120

TGATATTAAA AAGTTAATGT TCACTGTTAT ATTTTTTTCA CAAGTAAAAA ATGGQTTATT 180

TGCGGTAAAT AAAAATACCA GATATTTTGA ATTGATTAAA AAGGTTGAAA TAAGAGAGGA 240 

GGGGAAAGAA AAGAAGGTGG GGGCCCAGTA TGAAAGGGAA AGGTGTCATC AAATCATCTC 300 






TCTCTCTCTC TACCTTCGAC CCACGGGCCG TGTCCATTTA AAGCCCTGTC TCTTGCCATT 360

CCCCATCTGA CCACCAGAAG AAGAGCCACA CACTCACAAA TTAAAAAGAG AGAGAGAGAG 420

AGAGHGilCAG AGAGAGAGAG AGATTCTGCG GAGGAGCTTC TTCTTCGTAG GGTGTTCATC 480

GTTATTAACG TTATCGCCCC TACGTCAGCT CCATCTCCAG GTCCGTCGCT TCTCTTCCAT 540

TTCTTCTCAT TTTCGATTTT GATTCTTATT TCTTTCCAGT AGCTCCTGCT CTGTGAATTT 600

CTCCGCTCAC GATAGATCTG CTTATACTCC TTACATTCAA CCTTAGATCT GGTCTCGATT 660

CTCTGTTTCT CTGTTTTTTT CTTTTGGTCG AGAATCTGAT GTTTGTTTAT GTTCTGTCAC 720

CATTAATAAT GATGAACTCT CTCATTCATA GAATGATTAG TTTCTCTCGT CTACCAAACG 780

ATATGTTGCA TTTTCACTTT TCTTCTTTTT TTCTAAGATG ATTTGCTTTG ACCAATTTGT 840

TTAGATCTTT ATTTTATTTT ATTTTCTGGT GGGTTGGTGG AAATTGAAAA AAAAAAAAAA 900

AAAAGCATAA ATTGTTATTT GTTAATGTAT TCATTTTTTG GCTATTTGTT CTGGGTAAAA 960

ATCTGCTTCT ACTGTTGAAT CTTTCCTGGA TTTTTTACTC CTATTGGGTT TTTATAGTAA 1020

AAATACATAA TAAAAGGAAA ACAAAAGTTT TATAGATTCT CTTAAACCCC TTACGATAAA 1080

AGTTGGAATC AAAATAATTC AGGATCAGAT GCTCTTTGAT TGATTCAGAT GCGATTACAG 1140

TTGCATGGAA AATTTTCTAG ATCCGTCGTC ACATTTTATT TTCTGTTTAA ATATCTAAAT 1200

CTGATATATG ATGTCGACAA ATTCTGGTGG CTTATACATC ACTTCAACTG TTTTCTTTTG 1260

GCTTTGTTTG TCAACTXGGT TTTCAATACG ATTTGTGATT TCGATCGCTG AATTTTTAAT 1320

ACAAGCAAAC TGATGTTAAC CACAAGCAAG AGATGTGACC TGCCTTATTA ACATCGTATT 1380

ACTTACTACT AGTCGTATTC TCAACGCAAT CGTTTTTGTA TTTCTCACAT TATGCCGCTT 1440

CTCTACTCTT TATTCCTTTT GGTCCACGCA TTTTCTATTT GTGGCAATCC CTTTCACAAC 1500

CTGATTTCCC ACTTTGGATC ATTTGTCTGA AGACTCTCTT GAATCGTTAC CACTTGTTTC 1560

TTGTGCATGC TCTGTTTTTT AGAATTAATG ATAAAACTAT TCCATAGTCT TGAGTTTTCA 1620

GCTTGTTGAT TCTTTTGCTT TTGGTTTTCT GCAGAAACAT GGGTGCAGGT GGAAGAATGC 1680

                                 CCGACACCAC AAAGCGTGTG CCGTGCGAGA 1740

AACCGCCTTT CTCGGTGGGA GATCTGAAGA AAGCAATCCC GCCGCATTGT TTCAAACGCT 1800

CAATCCCTCG CTCTTTCTCC TACCTTATCA GTGACATCAT TATAGCCTCA TGCTTCTACT 1860

ACGTCGCCAC CAATTACTTC TCTCTCCTCC CTCAGCCTCT CTCTTACTTG GCTTGGCCAC 1920

TCTATTGGGC CTGTCAAGGC TGTGTCCTAA CTGGTATCTG GGTCATAGCC CACGAATGCG 1980

GTCACCACGC ATTCAGCGAC TACCAATGGC TGGATGACAC AGTTGGTCTT ATCTTCCATT 2040

CCTTCCTCCT CGTCCCTTAC TTCTCCTGGA AGTATAGTCA TCGCCGTCAC CATTCCAACA 2100

CTGGATCCCT CGAAAGAGAT GAAGTATTTG TCCCAAAGCA GAAATCAGCA ATCAAGTGGT 2160

ACGGGAAATA CCTCAACAAC CCTCTTGGAC GCATCATGAT GTTAACCGTC CAGTTTGTCC 2220

TCGGGTGGCC CTTGTACTTA GCCTTTAACG TCTCTGGCAG ACCGTATGAC GGGTTCGCTT 2280

GCGATTTCTT CCCCAACGCT CCCATCTACA ATGACCGAGA ACGCCTCCAG ATATACCTCT 2340

CTGATGCGGG TATTCTAGCC GTCTGTTTTG GTCTTTACCG TTACGCTGCT GCACAAGGGA 2400

TGGCCTCGAT GATCTGCCTC TACGGAGTAC CGCTTCTGAT AGTGAATGCG TTCCTCGTCT 2460

TGATCACTTA CTTGCAGCAC ACTCATCCCT CGTTGCCTCA CTACGATTCA TCAGAGTGGG 2520

ACTGGCTCAG GGGAGCTTTG GCTACCGTAG ACAGAGACTA CGGAATCTTG AACAAGGTGT 2580

TCCACAACAT TACAGACACA CACGTGGCTC ATCACCTGTT CTCGACAATG CCGCATTATA 2640

                                 CAATTCTGGG AGACTATTAC CAGTTCGATG 2700

GAACACCGTG GTATGTGGCG ATGTATAGGG AGGCAAAGGA GTGTATCTAT GTAGAACCGG 2760

ACAGGGAAGG TGACAAGAAA GGTGTGTACT GGTACAACAA TAAGTTATGA GGATGATGGT 2820

GAAGAAATTG TCGACTTTTC TCTTGTCTGT TTGTCTTTTG TTAAAGAAGC TATGCTTCGT 2880

TTTAATAATC TTATTGTCCA TTTTGTTGTG TTATGACATT TTGGCTGCTC ATTATGTTAT 2940

GTGGGAAGTT AGCGTTCAAA TGTTTTGGGT CGG 2973



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..23 

(D) OTHER INFORMATION: /product="synthetic oligonucleotide"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

GGGCATGTNG ARAANARRTG RTG 23



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..23

(D) OTHER INFORMATION: /product="synthetic oligonucleotide"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

GGGCATGTRC TRAANARRTG RTG 23
1. An isolated nucleic acid fragment comprising a
nucleic acid sequence encoding a fatty acid desaturase
or a fatty acid desaturase-related enzyme with an amino
acid identity of 50% or greater to the polypeptide
encoded by SEQ ID NOS:1, 3, 5, 7, 9, 11, or 15.

2. The isolated nucleic acid fragment of Claim 1
wherein the amino acid identity is 60% or greater to the
polypeptide encoded by SEQ ID NOS:1, 3, 5, 7, 9, 11, or
15.

3. The isolated nucleic acid fragment of Claim 1
wherein the nucleic acid identity is 90% or greater to
SEQ ID NOS:1, 3, 5, 7, 9, 11, or 15.

4. The isolated nucleic acid fragment of Claim 1
wherein said fragment is isolated from an oil-producing
plant species.

5. An isolated nucleic acid fragment comprising a 
nucleic acid sequence encoding a delta-12 fatty acid 
hydroxylase. 

6. A chimeric gene capable of causing altered
levels of ricinoleic acid in a transformed plant cell,
said chimeric gene comprising a nucleic acid fragment of
Claim 5, said fragment operably linked to suitable
regulatory sequences.

7. A chimeric gene capable of causing altered
levels of fatty acids in a transformed plant cell, said
chimeric gene comprising a nucleic acid fragment of any
of Claims 1, 2, 3, said fragment operably linked to
suitable regulatory sequences.

8. Plants containing a chimeric gene of Claim 6
or Claim 7.

9. Oil obtained from seeds of the plants 
containing the chimeric genes of Claim 8. 

10. A method of producing seed oil containing
altered levels of unsaturated fatty acids comprising:

(a) transforming a plant cell of an oil-producing
species with a chimeric gene of Claim 5;

(b) growing fertile plants from the
transformed plant cells of step (a);

(c) screening progeny seeds from the fertile
plants of step (b) for the desired levels of unsaturated
fatty acids; and

(d) processing the progeny seed of step (c)
to obtain seed oil containing altered levels of
unsaturated fatty acids.

11. A method of molecular breeding to obtain
altered levels of a fatty acid in seed oil of
oil-producing plant species comprising:

(a) making a cross between two varieties of
oil-producing species differing in the fatty acid trait;

(b) making a Southern blot of restriction
enzyme digested genomic DNA isolated from several
progeny plants resulting from the cross of step (a); and

(c) hybridizing the Southern blot with the
radiolabelled nucleic acid fragment of Claim 1.

12. A method of RFLP mapping comprising:

(a) making a cross between two varieties of
plants;

(b) making a Southern blot of restriction
enzyme digested genomic DNA isolated from several
progeny plants resulting from the cross of step (a); and

(c) hybridizing the Southern blot with the
radiolabelled nucleic acid fragments of Claim 1.

13. A method to isolate nucleic acid fragments
encoding fatty acid desaturases and related enzymes,
comprising:

(a) comparing SEQ ID NOS:2, 4, 6, 8, 10, or
12 and other fatty acid desaturase polypeptide
sequences;

(b) identifying the conserved sequences of 4
or more amino acids obtained in step a;

(c) designing degenerate oligomers based on
the conserved sequences identified in step b; and

(d) using the degenerate oligomers of step c
to isolate sequences encoding fatty acid desaturases and
desaturase-related enzymes by sequence dependent
protocols.

14. An isolated nucleic acid fragment of Claim 1
comprising a nucleic acid sequence encoding a plant
microsomal delta-12 fatty acid desaturase.
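Steps (a) to (c) of Claim 13 can be illustrated with a short sketch. The following Python example is an illustration only and is not the applicants' procedure: it assumes two already-aligned, equal-length peptide excerpts (residues 143-166 of SEQ ID NO:8 and residues 141-164 of SEQ ID NO:12, transcribed to one-letter code), reports runs of four or more identical residues, and estimates how degenerate an oligomer back-translated from a run would be using codon counts from the standard genetic code. The function names are hypothetical.

```python
# Number of codons per amino acid in the standard genetic code.
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def conserved_runs(a: str, b: str, min_len: int = 4):
    """Return (start_index, peptide) for identical runs of >= min_len residues.

    Assumes a and b are pre-aligned without gaps and of equal length.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    runs, start = [], None
    for i, (x, y) in enumerate(zip(a, b)):
        if x == y:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, a[start:i]))
            start = None
    if start is not None and len(a) - start >= min_len:
        runs.append((start, a[start:]))
    return runs


def backtranslation_degeneracy(peptide: str) -> int:
    """Number of distinct coding sequences that could encode the peptide."""
    total = 1
    for residue in peptide:
        total *= CODON_COUNT[residue]
    return total


maize = "WKYSHRRHHSNTGSLERDEVFVPK"   # excerpt of SEQ ID NO:8
castor = "WKYSHRRHHSNIGSLERDEVFVPK"  # excerpt of SEQ ID NO:12

for start, peptide in conserved_runs(maize, castor):
    print(start, peptide, backtranslation_degeneracy(peptide))
```

In practice a short window rich in Trp, Met, His or Cys would be chosen from within a conserved run, since those residues keep the degeneracy of the resulting oligomer low.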
PRODUCTION OF HYDROXYLATED FATTY ACIDS
IN GENETICALLY MODIFIED PLANTS

TECHNICAL FIELD 

The present invention concerns the
identification of nucleic acid sequences and
constructs, and methods related thereto, and the use
of these sequences and constructs to produce
genetically modified plants for the purpose of
altering the fatty acid composition of plant oils,
waxes and related compounds.

DEFINITIONS 

The subject of this invention is a class of
enzymes that introduce a hydroxyl group into several
different fatty acids, resulting in the production of
several different kinds of hydroxylated fatty acids.
In particular, these enzymes catalyze hydroxylation
of oleic acid to 12-hydroxyoleic acid and icosenoic
acid to 14-hydroxyicosenoic acid. Other fatty acids
such as palmitoleic and erucic acids may also be
substrates. Since it is not possible to refer to the
enzyme by reference to a unique substrate or
product, the enzyme is referred to throughout as kappa
hydroxylase to indicate that the enzyme introduces
the hydroxyl three carbons distal (i.e., away from
the carboxyl carbon of the acyl chain) from a double
bond located near the center of the acyl chain.

The following fatty acids are also the
subject of this invention: ricinoleic acid,
12-hydroxyoctadec-cis-9-enoic acid (12OH-18:1 cisΔ9);
lesquerolic acid, 14-hydroxy-cis-11-icosenoic acid
(14OH-20:1 cisΔ11); densipolic acid,
12-hydroxyoctadec-cis-9,15-dienoic acid
(12OH-18:2 cisΔ9,15); auricolic acid,
14-hydroxy-cis-11,17-icosadienoic acid
(14OH-20:2 cisΔ11,17); hydroxyerucic,
16-hydroxydocos-cis-13-enoic acid (16OH-22:1 cisΔ13);
hydroxypalmitoleic, 12-hydroxyhexadec-cis-9-enoic
(12OH-16:1 cisΔ9); icosenoic acid (20:1 cisΔ11).
It will be noted that icosenoic acid is spelled
eicosenoic acid in some countries.
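The "three carbons distal" rule in the definition of kappa hydroxylase can be checked against the monoenoic fatty acids named above. The following is a minimal Python sketch under that stated rule; the function name `kappa_hydroxylation` and the substrate table are illustrative assumptions, with chain lengths and double-bond positions taken from the systematic names in the text.

```python
# Substrate name -> (chain length, position of the cis double bond).
# Values follow the systematic names given in the text.
SUBSTRATES = {
    "oleic": (18, 9),
    "icosenoic": (20, 11),
    "palmitoleic": (16, 9),
    "erucic": (22, 13),
}


def kappa_hydroxylation(chain_length: int, double_bond: int) -> tuple[int, str]:
    """Return (hydroxyl position, shorthand) for a monoenoic substrate.

    Rule assumed from the definition: the hydroxyl is introduced three
    carbons distal to the double bond, counting from the carboxyl end.
    """
    hydroxyl = double_bond + 3
    if hydroxyl > chain_length:
        raise ValueError("hydroxyl position falls beyond the end of the chain")
    return hydroxyl, f"{hydroxyl}OH-{chain_length}:1"


for name, (length, bond) in SUBSTRATES.items():
    position, shorthand = kappa_hydroxylation(length, bond)
    print(f"{name}: hydroxyl at C{position} ({shorthand})")
```

The shorthand produced for each substrate matches the corresponding hydroxylated fatty acid in the list above (ricinoleic, lesquerolic, hydroxypalmitoleic and hydroxyerucic acids).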

BACKGROUND 

Extensive surveys of the fatty acid
composition of seed oils from different species of
higher plants have resulted in the identification of
at least 33 structurally distinct monohydroxylated
plant fatty acids, and 12 different polyhydroxylated
fatty acids that are accumulated by one or more
plant species (reviewed by van de Loo et al., 1993).
Ricinoleic acid, the principal constituent of the
seed oil from the castor plant Ricinus communis
(L.), is of commercial importance. The present
inventors have cloned a gene from this species that
encodes a fatty acid hydroxylase, and have used this
gene to produce ricinoleic acid in transgenic plants
of other species. Some of this scientific evidence
has been published by the present inventors (van de
Loo et al., 1995).

The use of the castor hydroxylase gene to
also produce other hydroxylated fatty acids such as
lesquerolic acid, densipolic acid,
hydroxypalmitoleic, hydroxyerucic and auricolic acid
in transgenic plants is the subject of this
invention. In addition, the identification of a gene
encoding a homologous hydroxylase from Lesquerella
fendleri and the use of this gene to produce these
hydroxylated fatty acids in transgenic plants is the
subject of this invention.
Castor is a minor oilseed crop. Approximately
50% of the seed weight is oil (triacylglycerol) in
which 85-90% of total fatty acids are the
hydroxylated fatty acid, ricinoleic acid. Oil
pressed or extracted from castor seeds has many
industrial uses based upon the properties endowed by
the hydroxylated fatty acid. The most important uses
are production of paints and varnishes, nylon-type
synthetic polymers, resins, lubricants, and
cosmetics (Atsmon, 1989).

In addition to oil, the castor seed contains
the extremely toxic protein ricin, allergenic
proteins, and the alkaloid ricinine. These
constituents preclude the use of the untreated seed
meal (following oil extraction) as a livestock feed,
normally an important economic aspect of oilseed
utilization. Furthermore, with the variable nature
of castor plants and a lack of investment in
breeding, castor has few favorable agronomic
characteristics.

For a combination of these reasons, castor is
no longer grown in the United States and the
development of an alternative domestic source of
hydroxylated fatty acids would be attractive. The
production of ricinoleic acid, the important
constituent of castor oil, in an established oilseed
crop through genetic engineering would be a
particularly effective means of creating a domestic
source.

Because there is no practical source of
lesquerolic, densipolic and auricolic acids from
plants that are adapted to modern agricultural
practices, there is currently no large-scale use of
these fatty acids by industry. However, the fatty
acids would have uses similar to those of ricinoleic
acid if they could be produced in large quantities
at comparable cost to other plant-derived fatty
acids (Smith, 1985).

Plant species, such as certain species in the
genus Lesquerella, that accumulate a high proportion
of these fatty acids, have not been domesticated and
are not currently considered a practical source of
fatty acids (Hirsinger, 1989). This invention
represents a useful step toward the eventual
production of these and other hydroxylated fatty
acids in transgenic plants of agricultural
importance.

The taxonomic relationships between plants
having similar or identical kinds of unusual fatty
acids have been examined (van de Loo et al., 1993).
In some cases, particular fatty acids occur mostly
or solely in related taxa. In other cases there does
not appear to be a direct link between taxonomic
relationships and the occurrence of unusual fatty
acids. In this respect, ricinoleic acid has now been
identified in 12 genera from 10 families (reviewed
in van de Loo et al., 1993). Thus, it appears that
the ability to synthesize hydroxylated fatty acids
has evolved several times independently during the
radiation of the angiosperms. This suggested to us
that the enzymes which introduce hydroxyl groups
into fatty acids arose by minor modifications of a
related enzyme.

Indeed, as shown herein, the sequence
similarity between Δ12 fatty acid desaturases and
the kappa hydroxylase from castor is so high that it
is not possible to unambiguously determine whether a
particular enzyme is a desaturase or a hydroxylase
on the basis of evidence in the scientific
literature. Similarly, a patent application (PCT WO
94/11516) that purports to teach the isolation and
use of Δ12 fatty acid desaturases does not teach how
to distinguish a hydroxylase from a desaturase. In
view of the importance of being able to distinguish
between these activities for the purpose of genetic
engineering of plant oils, the utility of that
application is limited to the several instances
where direct experimental evidence (e.g., altered
fatty acid composition in transgenic plants) was
presented to support the assignment of function. A
method for distinguishing between fatty acid
desaturases and fatty acid hydroxylases on the basis
of amino acid sequence of the enzyme is also a
subject of this invention.

A feature of hydroxylated or other unusual
fatty acids is that they are generally confined to
seed triacylglycerols, being largely excluded from
the polar lipids by unknown mechanisms (Battey and
Ohlrogge, 1989; Prasad et al., 1987). This is
particularly intriguing since diacylglycerol is a
precursor of both triacylglycerol and polar lipid.
With castor microsomes, there is some evidence that
the pool of ricinoleoyl-containing polar lipid is
minimized by a preference of diacylglycerol
acyltransferase for ricinoleate-containing
diacylglycerols (Bafor et al., 1991). Analyses of
vegetative tissues have generated few reports of
unusual fatty acids, other than those occurring in
the cuticle. The cuticle contains various
hydroxylated fatty acids which are interesterified
to produce a high molecular weight polyester which
serves a structural role. A small number of other
exceptions exist in which unusual fatty acids are
found in tissues other than the seed.

The biosynthesis of ricinoleic acid from
oleic acid in the developing endosperm of castor
(Ricinus communis) has been studied by a variety of
methods. Morris (1967) established in
double-labeling studies that hydroxylation occurs
directly by hydroxyl substitution rather than via an
unsaturated-, keto- or epoxy-intermediate.
Hydroxylation using oleoyl-CoA as precursor can be
demonstrated in crude preparations or microsomes,
but activity in microsomes is unstable and variable,
and isolation of the microsomes involved a
considerable, or sometimes complete, loss of activity
(Galliard and Stumpf, 1966; Moreau and Stumpf,
1981). Oleic acid can replace oleoyl-CoA as a
precursor, but only in the presence of CoA, Mg2+ and
ATP (Galliard and Stumpf, 1966), indicating that
activation to the acyl-CoA is necessary. However, no
radioactivity could be detected in ricinoleoyl-CoA
(Moreau and Stumpf, 1981). These and more recent
observations (Bafor et al., 1991) have been
interpreted as evidence that the substrate for the
castor oleate hydroxylase is oleic acid esterified
to phosphatidylcholine or another phospholipid.

The hydroxylase is sensitive to cyanide and
azide, and dialysis against metal chelators reduces
activity, which could be restored by addition of
FeSO4, suggesting iron involvement in enzyme activity
(Galliard and Stumpf, 1966). Ricinoleic acid
synthesis requires molecular oxygen (Galliard and
Stumpf, 1966; Moreau and Stumpf, 1981) and requires
NAD(P)H to reduce cytochrome b5, which is thought to
be the intermediate electron donor for the
hydroxylase reaction (Smith et al., 1992). Carbon
monoxide does not inhibit hydroxylation, indicating
that a cytochrome P450 is not involved (Galliard and
Stumpf, 1966; Moreau and Stumpf, 1981). Data from a
study of the substrate specificity of the
hydroxylase show that all substrate parameters
(i.e., chain length and double bond position with
respect to both ends) are important; deviations in
these parameters caused reduced activity relative to
oleic acid (Howling et al., 1972). The position at
which the hydroxyl was introduced, however, was
determined by the position of the double bond,
always being three carbons distal. Thus, the castor
acyl hydroxylase enzyme can produce a family of
different hydroxylated fatty acids depending on the
availability of substrates. Thus, as a matter of
convenience, the enzyme is referred to throughout
this specification as a kappa hydroxylase (rather
than an oleate hydroxylase) to indicate the broad
substrate specificity.

The castor kappa hydroxylase has many
superficial similarities to the microsomal fatty
acyl desaturases (Browse and Somerville, 1991). In
particular, plants have a microsomal oleate
desaturase active at the Δ12 position. The substrate
of this enzyme (Schmidt et al., 1993) and of the
hydroxylase (Bafor et al., 1991) appears to be a
fatty acid esterified to the sn-2 position of
phosphatidylcholine. When oleate is the substrate,
the modification occurs at the same position (Δ12)
in the carbon chain, and requires the same
cofactors, namely electrons from NADH via cytochrome
b5 and molecular oxygen. Neither enzyme is inhibited
by carbon monoxide (Moreau and Stumpf, 1981), the
characteristic inhibitor of cytochrome P450 enzymes.

There do not appear to have been any
published biochemical studies of the properties of
the hydroxylase enzyme(s) in Lesquerella.

Conceptual basis of the invention 

The present inventors have described the use
of a cDNA clone from castor for the production of
ricinoleic acid in transgenic plants. As noted
above, biochemical studies had suggested that the
castor hydroxylase may not have strict specificity
for oleic acid but would also catalyze hydroxylation
of other fatty acids such as icosenoic acid
(20:1cisΔ11) (Howling et al., 1972). Based on these
studies, expression of kappa hydroxylase in
transgenic plants of species such as Brassica napus
and Arabidopsis thaliana that accumulate fatty acids
such as icosenoic acid (20:1cisΔ11) and erucic acid
(13-docosenoic acid; 22:1cisΔ13) may cause the
accumulation of hydroxylated derivatives of these
fatty acids due to the activity of the hydroxylase
on these fatty acids. Direct evidence is presented
in Example 1 that hydroxylated derivatives of
ricinoleic, lesquerolic, densipolic and auricolic
fatty acids are produced in transgenic Arabidopsis
plants.

Example 2 shows the isolation of a novel
kappa hydroxylase gene from Lesquerella fendleri.

In view of the high degree of sequence
similarity between Δ12 fatty acid desaturases and
the castor hydroxylase (van de Loo et al., 1995),
the validity of claims (e.g., PCT WO 94/11516) for
using a limited set of desaturase or hydroxylase
genes or sequences derived therefrom to identify
genes of identical function from other species must
be viewed with skepticism. In this application, the
present inventors teach a method by which
hydroxylase genes can be distinguished from
desaturases. The present inventors describe a
mechanistic basis for the similar reaction
mechanisms of desaturases and hydroxylases. Briefly,
the available evidence suggests that fatty acid
desaturases have a similar reaction mechanism to the
bacterial enzyme methane monooxygenase which
catalyses a reaction involving oxygen-atom transfer
(CH4 → CH3OH) (van de Loo et al., 1993). The cofactor
in the hydroxylase component of methane
monooxygenase is termed a μ-oxo bridged diiron
cluster (FeOFe). The two iron atoms of the FeOFe
cluster are liganded by protein-derived nitrogen or
oxygen atoms, and are tightly redox-coupled by the
covalently-bridging oxygen atom. The FeOFe cluster
accepts two electrons, reducing it to the diferrous
state, before oxygen binding. Upon oxygen binding,
it is likely that heterolytic cleavage also occurs,
leading to a high valent oxoiron reactive species
that is stabilized by resonance rearrangements
possible within the tightly coupled FeOFe cluster.

The stabilized high-valent oxoiron state of methane
monooxygenase is capable of proton extraction from
methane, followed by oxygen transfer, giving
methanol. The FeOFe cofactor has been shown to be
directly relevant to plant fatty acid modifications
by the demonstration that castor stearoyl-ACP
desaturase contains this type of cofactor (Fox et
al., 1993).

On the basis of the foregoing considerations,
the present inventors suggest that the castor oleate
hydroxylase might be a structurally modified fatty
acyl desaturase, based upon three arguments. The
first argument involves the taxonomic distribution
of plants containing ricinoleic acid. Ricinoleic
acid has been found in 12 genera of 10 families of
higher plants (reviewed in van de Loo et al., 1993).
Thus, plants in which ricinoleic acid occurs are
found throughout the plant kingdom, yet close
relatives of these plants do not contain the unusual
fatty acid. This pattern suggests that the ability
to synthesize ricinoleic acid has arisen (and been
lost) several times independently, and has therefore
recently diverged. In other words, the ability
to synthesize ricinoleic acid has evolved rapidly,
suggesting that a relatively minor genetic change in
the structure of the ancestral enzyme was necessary
to accomplish it.

The second argument is that many biochemical
properties of castor kappa hydroxylase are similar
to those of the microsomal desaturases, as discussed
above (e.g., both preferentially act on fatty acids
esterified to the sn-2 position of
phosphatidylcholine, both use cytochrome b5 as an
intermediate electron donor, both are inhibited by
cyanide, both require molecular oxygen as a
substrate, both are thought to be located in the
endoplasmic reticulum).

The third argument stems from the discussion
of oxygenase cofactors above, in which it is
suggested that the plant membrane bound fatty acid
desaturases may have a μ-oxo bridged diiron
cluster-type cofactor, and that such cofactors are
capable of catalyzing both fatty acid desaturations
and hydroxylations, depending upon the electronic and
structural properties of the protein active site.

Taking these three arguments together, the
present inventors suggest that kappa hydroxylase of
castor endosperm is homologous to the microsomal
oleate Δ12 desaturase found in all plants. A number
of genes encoding microsomal Δ12 desaturases from
various species have recently been cloned (Okuley et
al., 1994) and substantial information about the
structure of these enzymes is now known (Shanklin et
al., 1994). Hence, in the following invention, the
present inventors teach how to use structural
information to isolate and identify kappa
hydroxylase genes. This example teaches the method
by which any carbon monoxide-insensitive plant fatty
acyl hydroxylase gene can be identified by one
skilled in the art.

An unpredicted outcome of our studies on the
castor hydroxylase gene in transgenic Arabidopsis
plants was the discovery that expression of the
hydroxylase leads to increased accumulation of oleic
acid in seed lipids. Because of the low nucleotide
sequence homology between the castor hydroxylase and
the Δ12 desaturase (about 67%), it is unlikely that
this effect is due to silencing (also called
sense-suppression or cosuppression) of the expression
of the desaturase gene by the hydroxylase gene.
Whatever the basis for the effect, this invention
teaches the use of hydroxylase genes to alter the
level of fatty acid unsaturation in transgenic
plants. This invention also teaches the use of
genetically modified hydroxylase and desaturase
genes to achieve directed modification of fatty acid
unsaturation levels.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figures 1A-D show the mass spectra of hydroxy
fatty acid standards (Figure 1A,
O-TMS-methylricinoleate; Figure 1B,
O-TMS-methyldensipoleate; Figure 1C,
O-TMS-methyllesqueroleate; and Figure 1D,
O-TMS-methylauricoleate).

Figure 2 shows the fragmentation pattern of
trimethylsilylated methyl esters of hydroxy fatty
acids.

Figure 3A shows the gas chromatogram of fatty
acids extracted from seeds of wild type Arabidopsis
plants. Figure 3B shows the gas chromatogram of
fatty acids extracted from seeds of transgenic
Arabidopsis plants containing the fah12 hydroxylase
gene. The numbers indicate the following fatty
acids: [1] 16:0; [2] 18:0; [3] 18:1cisΔ9; [4]
18:2cisΔ9,12; [5] 20:0; [6] 20:1cisΔ11; [7]
18:3cisΔ9,12,15; [8] 20:2cisΔ11,14; [9] 22:1cisΔ13;
[10] ricinoleic acid; [11] densipolic acid; [12]
lesquerolic acid; and [13] auricolic acid.

Figures 4A-D show the mass spectra of novel
fatty acids found in seeds of transgenic plants.
Figure 4A shows the mass spectrum of peak 10 from
Figure 3B. Figure 4B shows the mass spectrum of peak
11 from Figure 3B. Figure 4C shows the mass spectrum
of peak 12 from Figure 3B. Figure 4D shows the mass
spectrum of peak 13 from Figure 3B.

Figure 5 shows the nucleotide sequence of
pLesq2 (SEQ ID NO:1).

Figure 6 shows the nucleotide sequence of
pLesq3 (SEQ ID NO:2).

Figure 7 shows a Northern blot of total RNA
from seeds of L. fendleri probed with pLesq2 or
pLesq3. S, indicates RNA is from seeds; L, indicates
RNA is from leaves.

Figures 8A-B show the nucleotide sequence of 
genomic clone encoding pLesq-HYD (SEQ ID NO: 3), and 
the deduced amino acid sequence of hydroxylase 
enzyme encoded by the gene (SEQ ID NO: 4) . 

Figures 9A-B show multiple sequence alignment
of deduced amino acid sequences for kappa
hydroxylases and microsomal Δ12 desaturases.
Abbreviations are: Rcfah12, fah12 hydroxylase gene
from R. communis (van de Loo et al., 1995); Lffah12,
kappa hydroxylase gene from L. fendleri; Atfad2,
fad2 desaturase from Arabidopsis thaliana (Okuley et
al., 1994); Gmfad2-1, fad2 desaturase from Glycine
max (GenBank accession number L43920); Gmfad2-2,
fad2 desaturase from Glycine max (GenBank accession
number L43921); Zmfad2, fad2 desaturase from Zea
mays (PCT WO 94/11516); Rcfad2, fragment of fad2
desaturase from R. communis (PCT WO 94/11516);
Bnfad2, fad2 desaturase from Brassica napus (PCT WO
94/11516); LFFAH12.AMI, SEQ ID NO:4; FAH12.AMI, SEQ
ID NO:5; ATFAD2.AMI, SEQ ID NO:6; BNFAD2.AMI, SEQ ID
NO:7; GMFAD2-1.AMI, SEQ ID NO:8; GMFAD2-2.AMI, SEQ
ID NO:9; ZMFAD2.AMI, SEQ ID NO:10; and RCFAD2.AMI,
SEQ ID NO:11.

Figure 10 shows a Southern blot of genomic
DNA from L. fendleri probed with pLesq-HYD. E =
EcoRI, H = HindIII, X = XbaI.

Figure 11 shows a map of binary Ti plasmid
pSLJ44024.

Figure 12 shows a map of plasmid pYES2.0.

Figure 13 shows part of a gas chromatogram of
derivatized fatty acids from yeast cells that
contain plasmid pLesqYes in which expression of the
hydroxylase gene was induced by addition of
galactose to the growth medium. The arrow points to
a peak that is not present in uninduced cells. The
lower part of the figure is the mass spectrum of the
peak indicated by the arrow.

SUMMARY OF THE INVENTION 

This invention relates to plant fatty acyl
hydroxylases. Methods to use conserved amino acid or
nucleotide sequences to obtain plant fatty acyl
hydroxylases are described. Also described is the
use of cDNA clones encoding a plant hydroxylase to
produce a family of hydroxylated fatty acids in
transgenic plants.

In a first embodiment, this invention is
directed to recombinant DNA constructs which can
provide for the transcription, or transcription and
translation (expression) of the plant kappa
hydroxylase sequence. In particular, constructs
which are capable of transcription, or transcription
and translation in plant host cells are preferred.
Such constructs may contain a variety of regulatory
regions including transcriptional initiation regions
obtained from genes preferentially expressed in
plant seed tissue. In a second aspect, this
invention relates to the presence of such constructs
in host cells, especially plant host cells which
have an expressed plant kappa hydroxylase therein.

In yet another aspect, this invention relates
to a method for producing a plant kappa hydroxylase
in a host cell or progeny thereof via the expression
of a construct in the cell. Cells containing a plant
kappa hydroxylase as a result of the production of
the plant kappa hydroxylase encoding sequence are
also contemplated herein.

In another embodiment, this invention relates
to methods of using a DNA sequence encoding a plant
kappa hydroxylase for the modification of the
proportion of hydroxylated fatty acids produced
within a cell, especially plant cells. Plant cells
having such a modified hydroxylated fatty acid
composition are also contemplated herein.

In a further aspect of this invention, plant
kappa hydroxylase proteins and sequences which are
related thereto, including amino acid and nucleic
acid sequences, are contemplated. Plant kappa
hydroxylase exemplified herein includes a
Lesquerella fendleri fatty acid hydroxylase. This
exemplified fatty acid hydroxylase may be used to
obtain other plant fatty acid hydroxylases of this
invention.

In a further aspect of this invention, a
nucleic acid sequence which directs the
seed-specific expression of an associated polypeptide
coding sequence is described. The use of this
nucleic acid sequence or fragments derived
therefrom, to obtain seed-specific expression in
higher plants of any coding sequence is contemplated
herein.

In a further aspect of this invention, genes
encoding fatty acyl hydroxylases of this invention
are used to alter the amount of fatty acid
unsaturation of seed lipids. The present
invention further discloses the use of genetically
modified hydroxylase and desaturase genes to achieve



WO 97/30582

PCT/US97/02187

directed modification of fatty acid unsaturation
levels.

DETAILED DESCRIPTION OF THE INVENTION 

A genetically transformed plant of the
present invention which accumulates hydroxylated
fatty acids can be obtained by expressing the
double-stranded DNA molecules described in this
application.

A plant fatty acid hydroxylase of this
invention includes any sequence of amino acids, such
as a protein, polypeptide or peptide fragment, or
nucleic acid sequences encoding such polypeptides,
obtainable from a plant source which demonstrates
the ability to catalyze the production of
ricinoleic, lesquerolic, hydroxyerucic
(16-hydroxydocos-cis-13-enoic acid) or
hydroxypalmitoleic (12-hydroxyhexadec-cis-9-enoic)
from CoA, ACP or lipid-linked monoenoic fatty acid
substrates under plant enzyme reactive conditions.
By "enzyme reactive conditions" is meant that any
necessary conditions are available in an environment
(i.e., such factors as temperature, pH, lack of
inhibiting substances) which will permit the enzyme
to function.

Preferential activity of a plant fatty acid
hydroxylase toward a particular fatty acyl substrate
is determined upon comparison of hydroxylated fatty
acid product amounts obtained per different fatty
acyl substrates. For example, by "oleate preferring"
is meant that the hydroxylase activity of the enzyme
preparation demonstrates a preference for
oleate-containing substrates over other substrates.
Although the precise substrate of the castor fatty
acid hydroxylase is not known, it is thought to be a
monounsaturated fatty acid moiety which is
esterified to a phospholipid such as
phosphatidylcholine. However, it is also possible
that monounsaturated fatty acids esterified to
phosphatidylethanolamine, phosphatidic acid or a
neutral lipid such as diacylglycerol or a Coenzyme-A
thioester may also be substrates.

As noted above, significant activity has been
observed in radioactive labelling studies using
fatty acyl substrates other than oleate (Howling et
al., 1972) indicating that the substrate specificity
is for a family of related fatty acyl compounds.
Because the castor hydroxylase introduces hydroxy
groups three carbons from a double bond, proximal to
the methyl carbon of the fatty acid, the enzyme is
termed a kappa hydroxylase for convenience. Of
particular interest, the present invention discloses
that the castor kappa hydroxylase may be used for
production of 12-hydroxy-9-octadecenoic acid
(ricinoleate), 12-hydroxy-9-hexadecenoic acid,
14-hydroxy-11-eicosenoic acid, 16-hydroxy-13-docosenoic
acid, 9-hydroxy-6-octadecenoic acid by expression in
plant species which produce the non-hydroxylated
precursors. The present invention also discloses
production of additionally modified fatty acids such
as 12-hydroxy-9,15-octadecadienoic acid that result
from desaturation of hydroxylated fatty acids (e.g.,
12-hydroxy-9-octadecenoic acid in this example).
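
The positional rule described above (the hydroxyl enters three carbons beyond the double bond, toward the methyl end) can be illustrated with a short Python sketch. It is offered only as an illustration under stated assumptions, not as part of the disclosed method: the names `Monoenoic`, `hydroxyl_position` and `product_label` are hypothetical, and positions are counted from the carboxyl carbon.

```python
from dataclasses import dataclass

# Assumption for this sketch: the hydroxyl is placed three carbons beyond
# the double bond, with positions counted from the carboxyl carbon.
HYDROXYL_OFFSET = 3


@dataclass(frozen=True)
class Monoenoic:
    """A monounsaturated fatty acid (hypothetical helper type)."""
    carbons: int      # total chain length, e.g. 18 for oleic acid
    double_bond: int  # double-bond position from the carboxyl end


def hydroxyl_position(substrate: Monoenoic) -> int:
    """Return the carbon expected to carry the hydroxyl group."""
    position = substrate.double_bond + HYDROXYL_OFFSET
    if position > substrate.carbons:
        raise ValueError("hydroxyl position falls outside the acyl chain")
    return position


def product_label(substrate: Monoenoic) -> str:
    """Describe the product as '<OH>-hydroxy-<double bond> (C<length>)'."""
    return (f"{hydroxyl_position(substrate)}-hydroxy-"
            f"{substrate.double_bond} (C{substrate.carbons})")


if __name__ == "__main__":
    # Substrates corresponding to the products named in the text.
    for fatty_acid in (Monoenoic(18, 9), Monoenoic(16, 9), Monoenoic(20, 11),
                       Monoenoic(22, 13), Monoenoic(18, 6)):
        print(product_label(fatty_acid))
```

For the five substrates listed, the sketch gives hydroxyl positions 12, 12, 14, 16 and 9, which agree with the products named in the preceding paragraph.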

The present invention also discloses that
future advances in the genetic engineering of plants
will lead to production of substrate fatty acids,
such as icosenoic acid esters, and palmitoleic acid
esters in plants that do not normally accumulate
such fatty acids. The invention described herein may
be used in conjunction with such future improvements
to produce hydroxylated fatty acids of this
invention in any plant species that is amenable to
directed genetic modification. Thus, the
applicability of this invention is not limited in
our conception only to those species that currently
accumulate suitable substrates.

As noted above, a plant kappa hydroxylase of
this invention will display activity towards various
fatty acyl substrates. During biosynthesis of lipids
in a plant cell, fatty acids are typically
covalently bound to acyl carrier protein (ACP),
coenzyme A (CoA) or various cellular lipids. Plant
kappa hydroxylases which display preferential
activity toward lipid-linked acyl substrate are
especially preferred because they are likely to be
closely associated with the normal pathway of storage
lipid synthesis in immature embryos. However,
activity toward acyl-CoA substrates or other
synthetic substrates, for example, is also
contemplated herein.

Other plant kappa hydroxylases are obtainable
from the specific exemplified sequences provided
herein. Furthermore, it will be apparent that one
can obtain natural and synthetic plant kappa
hydroxylases including modified amino acid sequences
and starting materials for synthetic-protein
modeling from the exemplified plant kappa
hydroxylase and from plant kappa hydroxylases which
are obtained through the use of such exemplified
sequences. Modified amino acid sequences include
sequences which have been mutated, truncated,
elongated or the like, whether such sequences were
partially or wholly synthesized. Sequences which are
actually purified from plant preparations or are
identical or encode identical proteins thereto,
regardless of the method used to obtain the protein
or sequence, are equally considered naturally
derived.

Thus, one skilled in the art will readily
recognize that antibody preparations, nucleic acid
probes (DNA and RNA) or the like may be prepared and
used to screen and recover "homologous" or "related"
kappa hydroxylases from a variety of plant sources.
Typically, nucleic acid probes are labeled to allow
detection, preferably with radioactivity although
enzymes or other methods may also be used. For
immunological screening methods, antibody
preparations either monoclonal or polyclonal are
utilized. Polyclonal antibodies, although less
specific, typically are more useful in gene
isolation. For detection, the antibody is labeled
using radioactivity or any one of a variety of
second antibody/enzyme conjugate systems that are
commercially available.

Homologous sequences are found when there is
an identity of sequence and may be determined upon
comparison of sequence information, nucleic acid or
amino acid, or through hybridization reactions
between a known kappa hydroxylase and a candidate
source. Conservative changes, such as Glu/Asp,
Val/Ile, Ser/Thr, Arg/Lys and Gln/Asn may also be
considered in determining sequence homology.
Typically, a lengthy nucleic acid sequence may show
as little as 50-60% sequence identity, and more
preferably at least about 70% sequence identity,
between the target sequence and the given plant
kappa hydroxylase of interest excluding any
deletions which may be present, and still be
considered related. Amino acid sequences are
considered homologous by as little as 25% sequence
identity between the two complete mature proteins
(see generally, Doolittle, R.F., OF URFS and ORFS,
University Science Books, CA, 1986).
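
As a rough illustration of how such percentages might be computed from a pairwise alignment, the following Python sketch counts identical positions while skipping gapped columns (one reading of "excluding any deletions"), and separately counts the conservative pairs named above as matches. The function names and the two short aligned peptides are hypothetical and are not taken from the sequences of this specification.

```python
# Conservative pairs named in the text, stored as unordered pairs.
CONSERVATIVE_PAIRS = {frozenset(pair) for pair in ("ED", "VI", "ST", "RK", "QN")}
GAP = "-"


def compared_columns(aligned_a: str, aligned_b: str):
    """Return alignment columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
            if GAP not in (x, y)]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of ungapped columns holding identical residues."""
    columns = compared_columns(aligned_a, aligned_b)
    if not columns:
        return 0.0
    return 100.0 * sum(x == y for x, y in columns) / len(columns)


def percent_similarity(aligned_a: str, aligned_b: str) -> float:
    """Like percent_identity, but conservative changes also count."""
    columns = compared_columns(aligned_a, aligned_b)
    if not columns:
        return 0.0
    hits = sum(x == y or frozenset((x, y)) in CONSERVATIVE_PAIRS
               for x, y in columns)
    return 100.0 * hits / len(columns)


if __name__ == "__main__":
    # Hypothetical aligned peptides, for illustration only.
    a, b = "HECGHH-AF", "HDCGHHKSF"
    print(percent_identity(a, b), percent_similarity(a, b))
```

In this toy alignment eight columns are compared; six are identical and one more (Glu/Asp) is a conservative change.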

A genomic or other appropriate library
prepared from the candidate plant source of interest
may be probed with conserved sequences from the
plant kappa hydroxylase to identify homologously
related sequences. Use of an entire cDNA or other
sequence may be employed if shorter probe sequences
are not identified. Positive clones are then
analyzed by restriction enzyme digestion and/or
sequencing. When a genomic library is used, one or
more sequences may be identified providing both the
coding region, as well as the transcriptional
regulatory elements of the kappa hydroxylase gene
from such plant source. Probes can also be
considerably shorter than the entire sequence.
Oligonucleotides may be used, for example, but
should be at least about 10, preferably at least
about 15, more preferably at least 20 nucleotides in
length. When shorter length regions are used for
comparison, a higher degree of sequence identity is
required than for longer sequences. Shorter probes
are often particularly useful for polymerase chain
reactions (PCR), especially when highly conserved
sequences can be identified (see Gould et al., 1989
for examples of the use of PCR to isolate homologous
genes from taxonomically diverse species).

When longer nucleic acid fragments are
employed (>100 bp) as probes, especially when using
complete or large cDNA sequences, one would screen
with low stringencies (for example, 40-50°C below
the melting temperature of the probe) in order to
obtain signal from the target sample with 20-50%
deviation, i.e., homologous sequences (Beltz et al.,
1983).
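
The low-stringency window mentioned above is simple arithmetic, sketched here in Python. The function name and the example melting temperature are hypothetical; the sketch assumes the melting temperature of the probe has already been determined by some other means and does not estimate it.

```python
from typing import Tuple


def low_stringency_window(tm_celsius: float,
                          min_offset: float = 40.0,
                          max_offset: float = 50.0) -> Tuple[float, float]:
    """Return (coolest, warmest) screening temperature in degrees C.

    The defaults reflect the 40-50 degree range below the probe melting
    temperature given in the text as an example of low stringency.
    """
    if not 0 <= min_offset <= max_offset:
        raise ValueError("offsets must satisfy 0 <= min_offset <= max_offset")
    return (tm_celsius - max_offset, tm_celsius - min_offset)


if __name__ == "__main__":
    # Hypothetical probe with a melting temperature of 85 degrees C.
    print(low_stringency_window(85.0))
```

With the hypothetical melting temperature of 85°C, the window runs from 35°C to 45°C.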

In a preferred embodiment, a plant kappa
hydroxylase of this invention will have at least 60%
overall amino acid sequence similarity with the
exemplified plant kappa hydroxylase. In particular,
kappa hydroxylases which are obtainable from an
amino acid or nucleic acid sequence of a castor or
Lesquerella kappa hydroxylase are especially
preferred. The plant kappa hydroxylases may have
preferential activity toward longer or shorter chain
fatty acyl substrates. Plant fatty acyl hydroxylases
having oleate-12-hydroxylase activity and
eicosenoate-14-hydroxylase activity are both
considered homologously related proteins because of
in vitro evidence (Howling et al., 1972), and
evidence disclosed herein, that the castor kappa
hydroxylase will act on both substrates.
Hydroxylated fatty acids may be subject to further
enzymatic modification by other enzymes which are
normally present or are introduced by genetic
engineering methods. For example,
14-hydroxy-11,17-eicosadienoic acid, which is present
in some Lesquerella species (Smith, 1985), is thought
to be produced by desaturation of
14-hydroxy-11-eicosenoic acid.

Again, not only can gene clones and materials 
derived therefrom be used to identify homologous 
plant fatty acyl hydroxylases, but the resulting 
sequences obtained therefrom may also provide a
further method to obtain plant fatty acyl
hydroxylases from other plant sources. In
particular, PCR may be a useful technique to obtain
related plant fatty acyl hydroxylases from sequence
data provided herein. One skilled in the art will be
able to design oligonucleotide probes based upon
sequence comparisons or regions of typically highly
conserved sequence. Of special interest are
polymerase chain reaction primers based on the
conserved regions of amino acid sequence between the
castor kappa hydroxylase and the L. fendleri
hydroxylase (SEQ ID NO:4). Details relating to the
design and methods for a PCR reaction using these
probes are described more fully in the examples.
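
By way of illustration only, the selection of a primer
site from conserved amino acid sequence may be sketched
as follows. The sketch assumes two pre-aligned protein
sequences of equal length. The toy sequences, the window
width and all function names are hypothetical and are not
taken from SEQ ID NO:4 or from the castor sequence. The
code finds windows that are identical in both sequences
and ranks them by the degeneracy of the back-translated
oligonucleotide, on the assumption that a less degenerate
primer pool is preferable.

```python
from itertools import product
from math import prod

# Standard genetic code in TCAG order (NCBI translation table 1).
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product("TCAG", repeat=3), AMINO_ACIDS)}

# Number of synonymous codons for each amino acid.
CODONS_PER_AA = {}
for _codon, _aa in CODON_TABLE.items():
    CODONS_PER_AA[_aa] = CODONS_PER_AA.get(_aa, 0) + 1


def conserved_windows(seq_a, seq_b, width):
    """Return (start, peptide) for every gap-free window identical in both."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    hits = []
    for start in range(len(seq_a) - width + 1):
        a = seq_a[start:start + width]
        b = seq_b[start:start + width]
        if a == b and "-" not in a:
            hits.append((start, a))
    return hits


def degeneracy(peptide):
    """Number of distinct DNA sequences that encode the peptide."""
    return prod(CODONS_PER_AA[aa] for aa in peptide)


def best_primer_site(seq_a, seq_b, width):
    """Pick the conserved window with the lowest degeneracy (ties: leftmost)."""
    windows = conserved_windows(seq_a, seq_b, width)
    if not windows:
        return None
    start, peptide = min(windows, key=lambda w: (degeneracy(w[1]), w[0]))
    return start, peptide, degeneracy(peptide)


# Hypothetical aligned fragments, for illustration only.
TOY_A = "MGWHMCGHKAFS"
TOY_B = "MAWHMCGHKSFT"
print(best_primer_site(TOY_A, TOY_B, width=5))
```

Windows rich in methionine and tryptophan, which have a
single codon each, score best under this ranking.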

It should also be noted that the fatty acyl
hydroxylases of a variety of sources can be used to
investigate fatty acid hydroxylation events in a
wide variety of plant and in vivo applications.
Because all plants synthesize fatty acids via a
common metabolic pathway, the study and/or
application of one plant fatty acid hydroxylase to a
heterologous plant host may be readily achieved in a
variety of species.

Once the nucleic acid sequence is obtained,
the transcription, or transcription and translation
(expression), of the plant fatty acyl hydroxylases
in a host cell is desired to produce a ready source
of the enzyme and/or modify the composition of fatty
acids found therein in the form of free fatty acids,
esters (particularly esterified to glycerolipids or
as components of wax esters), estolides, or ethers.
Other useful applications may be found when the host
cell is a plant host cell, in vitro and in vivo. For
example, by increasing the amount of a kappa
hydroxylase available to the plant, an increased
percentage of ricinoleate or lesqueroleate
(14-hydroxy-11-eicosenoic acid) may be provided.

Kappa Hydroxylase
By this invention, a mechanism for the
biosynthesis of ricinoleic acid in plants is
demonstrated, namely, that a specific plant kappa
hydroxylase having preferential activity toward
fatty acyl substrates is involved in the
accumulation of hydroxylated fatty acids in at least
some plant species. The use of the terms ricinoleate
or ricinoleic acid (or lesqueroleate or lesquerolic
acid, densipoleate etc.) is intended to include the
free acids, the ACP and CoA esters, the salts of
these acids, the glycerolipid esters (particularly
the triacylglycerol esters), the wax esters, the
estolides and the ether derivatives of these acids.

The determination that plant fatty acyl
hydroxylases are active in the in vivo production of
hydroxylated fatty acids suggests several
possibilities for plant enzyme sources. In fact,
hydroxylated fatty acids are found in some natural
plant species in abundance. For example, three
hydroxy fatty acids related to ricinoleate occur in
major amounts in seed oils from various Lesquerella
species. Of particular interest, lesquerolic acid is
a 20 carbon homolog of ricinoleate with two
additional carbons at the carboxyl end of the chain
(Smith, 1985). Other natural plant sources of
hydroxylated fatty acids include but are not limited
to seeds of the Linum genus, seeds of Wrightia
species, Lycopodium species, Strophanthus species,
Convolvulaceae species, Calendula species and many
others (van de Loo et al., 1993).

Plants having significant presence of
ricinoleate or lesqueroleate or desaturated or other
modified derivatives of these fatty acids are
preferred candidates to obtain naturally-derived
kappa hydroxylases. For example, Lesquerella
densipila contains a diunsaturated 18 carbon fatty
acid with a hydroxyl group (van de Loo et al., 1993)
that is thought to be produced by an enzyme that is
closely related to the castor kappa hydroxylase,
according to the theory on which this invention is
based. In addition, a comparison between kappa
hydroxylases and between plant fatty acyl
hydroxylases which introduce hydroxyl groups at
positions other than the 12-carbon of oleate or the
14-carbon of lesqueroleate or on substrates other
than oleic acid and icosenoic acid may yield
insights for gene identification, protein modeling
or other modifications as discussed above.

Especially of interest are fatty acyl
hydroxylases which demonstrate activity toward fatty
acyl substrates other than oleate, or which
introduce the hydroxyl group at a location other
than the C12 carbon. As described above, other plant
sources may also provide sources for these enzymes
through the use of protein purification, nucleic
acid probes, antibody preparations, protein
modeling, or sequence comparisons, for example, and
of special interest are the respective amino acid
and nucleic acid sequences corresponding to such
plant fatty acyl hydroxylases. Also, as previously
described, once a nucleic acid sequence is obtained
for the given plant hydroxylase, further plant
sequences may be compared and/or probed to obtain
homologously related DNA sequences thereto and so
on.

Genetic Engineering Applications
As is well known in the art, once a cDNA
clone encoding a plant kappa hydroxylase is
obtained, it may be used to obtain the corresponding
genomic nucleic acid sequences.

The nucleic acid sequences which encode plant
kappa hydroxylases may be used in various
constructs, for example, as probes to obtain further
sequences from the same or other species.
Alternatively, these sequences may be used in
conjunction with appropriate regulatory sequences to
increase levels of the respective hydroxylase of
interest in a host cell for the production of
hydroxylated fatty acids or study of the enzyme in
vitro or in vivo or to decrease or increase levels
of the respective hydroxylase of interest for some
applications when the host cell is a plant entity,
including plant cells, plant parts (including but
not limited to seeds, cuttings or tissues) and
plants.

A nucleic acid sequence encoding a plant
kappa hydroxylase of this invention may include
genomic, cDNA or mRNA sequence. By "encoding" is
meant that the sequence corresponds to a particular
amino acid sequence either in a sense or anti-sense
orientation. By "recombinant" is meant that the
sequence contains a genetically engineered
modification through manipulation via mutagenesis,
restriction enzymes, or the like. A cDNA sequence
may or may not encode pre-processing sequences, such
as transit or signal peptide sequences. Transit or
signal peptide sequences facilitate the delivery of
the protein to a given organelle and are frequently
cleaved from the polypeptide upon entry into the
organelle, releasing the "mature" sequence. The use
of the precursor DNA sequence is preferred in plant
cell expression cassettes.

Furthermore, as discussed above, the complete
genomic sequence of the plant kappa hydroxylase may
be obtained by the screening of a genomic library
with a probe, such as a cDNA probe, and isolating
those sequences which regulate expression in seed
tissue.

Once the desired plant kappa hydroxylase
nucleic acid sequence is obtained, it may be
manipulated in a variety of ways. Where the sequence
involves non-coding flanking regions, the flanking
regions may be subjected to resection, mutagenesis,
etc. Thus, transitions, transversions, deletions,
and insertions may be performed on the naturally
occurring sequence. In addition, all or part of the
sequence may be synthesized. In the structural gene,
one or more codons may be modified to provide for a
modified amino acid sequence, or one or more codon
mutations may be introduced to provide for a
convenient restriction site or other purpose
involved with construction or expression. The
structural gene may be further modified by employing
synthetic adapters, linkers to introduce one or more
convenient restriction sites, or the like.

The nucleic acid or amino acid sequences
encoding a plant kappa hydroxylase of this invention
may be combined with other non-native, or
"heterologous", sequences in a variety of ways. By
"heterologous" sequences is meant any sequence which
is not naturally found joined to the plant kappa
hydroxylase, including, for example, combination of
nucleic acid sequences from the same plant which are
not naturally found joined together.

The DNA sequence encoding a plant kappa
hydroxylase of this invention may be employed in
conjunction with all or part of the gene sequences
normally associated with the kappa hydroxylase. In
its component parts, a DNA sequence encoding kappa
hydroxylase is combined in a DNA construct having,
in the 5' to 3' direction of transcription, a
transcription initiation control region capable of
promoting transcription and/or translation in a host
cell, the DNA sequence encoding plant kappa
hydroxylase and a transcription and/or translation
termination region.
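
The 5' to 3' ordering of such a construct may be
illustrated by the following minimal sketch. The class
name, role labels and the short placeholder sequences are
hypothetical and stand in for real regulatory and coding
regions. The sketch only checks that the elements are
supplied in the order initiation region, coding sequence,
termination region before joining them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    role: str       # "promoter", "coding_sequence" or "terminator"
    name: str       # free-text label
    sequence: str   # DNA, written 5' to 3'


# Order of elements in the 5' to 3' direction of transcription.
REQUIRED_ORDER = ("promoter", "coding_sequence", "terminator")


def assemble(elements):
    """Join the elements 5' to 3' after checking their order."""
    roles = tuple(e.role for e in elements)
    if roles != REQUIRED_ORDER:
        raise ValueError(f"expected roles {REQUIRED_ORDER}, got {roles}")
    return "".join(e.sequence for e in elements)


# Placeholder sequences only; not real promoter or hydroxylase DNA.
PARTS = [
    Element("promoter", "seed-specific initiation region", "TATAAA"),
    Element("coding_sequence", "kappa hydroxylase ORF", "ATGGCTTAA"),
    Element("terminator", "termination region", "AATAAA"),
]
print(assemble(PARTS))
```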

Potential host cells include both prokaryotic
and eukaryotic cells. A host cell may be unicellular
or found in a multicellular differentiated or
undifferentiated organism depending upon the
intended use. Cells of this invention may be
distinguished by having a plant kappa hydroxylase
foreign to the wild-type cell present therein, for
example, by having a recombinant nucleic acid
construct encoding a plant kappa hydroxylase
therein.

Depending upon the host, the regulatory
regions will vary, including regions from viral,
plasmid or chromosomal genes, or the like. For
expression in prokaryotic or eukaryotic
microorganisms, particularly unicellular hosts, a
wide variety of constitutive or regulatable
promoters may be employed. Expression in a
microorganism can provide a ready source of the
plant enzyme. Among transcriptional initiation
regions which have been described are regions from
bacterial and yeast hosts, such as E. coli, B.
subtilis, Saccharomyces cerevisiae, including genes
such as beta-galactosidase, T7 polymerase, trpE or
the like.

For the most part, the constructs will
involve regulatory regions functional in plants
which provide for modified production of plant kappa
hydroxylase with resulting modification of the fatty
acid composition. The open reading frame, coding for
the plant kappa hydroxylase or functional fragment
thereof, will be joined at its 5' end to a
transcription initiation regulatory region. Numerous
transcription initiation regions are available which
provide for a wide variety of constitutive or
regulatable, e.g., inducible, transcription of the
structural gene functions.

Among transcriptional initiation regions used
for plants are such regions associated with the
structural genes such as for nopaline and mannopine
synthases, or with napin, soybean β-conglycinin,
oleosin, 12S storage protein, the cauliflower mosaic
virus 35S promoters or the like. The transcription/
translation initiation regions corresponding to such
structural genes are found immediately 5' upstream
to the respective start codons.

In embodiments wherein the expression of the
kappa hydroxylase protein is desired in a plant
host, the use of all or part of the complete plant
kappa hydroxylase gene is desired. If a different
promoter is desired, such as a promoter native to
the plant host of interest or a modified promoter,
i.e., having transcription initiation regions
derived from one gene source and translation
initiation regions derived from a different gene
source or enhanced promoters, such as double 35S
CaMV promoters, the sequences may be joined together
using standard techniques.

For such applications when 5' upstream
non-coding regions are obtained from other genes
regulated during seed maturation, those
preferentially expressed in plant embryo tissue,
such as transcription initiation control regions
from the B. napus napin gene, or the Arabidopsis 12S
storage protein, or soybean β-conglycinin (Bray et
al., 1987) are desired. Transcription initiation
regions which are preferentially expressed in seed
tissue, i.e., which are undetectable in other plant
parts, are considered desirable for fatty acid
modifications in order to minimize any disruptive or
adverse effects of the gene product.

Regulatory transcript termination regions may
be provided in DNA constructs of this invention as
well. Transcript termination regions may be provided
by the DNA sequence encoding the plant kappa
hydroxylase or a convenient transcription
termination region derived from a different gene
source, for example, the transcript termination
region which is naturally associated with the
transcript initiation region. Where the transcript
termination region is from a different gene source,
it will contain at least about 0.5 kb, preferably
about 1-3 kb of sequence 3' to the structural gene
from which the termination region is derived.

Plant expression or transcription constructs
having a plant kappa hydroxylase as the DNA sequence
of interest for increased or decreased expression
thereof may be employed with a wide variety of plant
life, particularly, plant life involved in the
production of vegetable oils for edible and
industrial uses. Most especially preferred are
temperate oilseed crops. Plants of interest include,
but are not limited to, rapeseed (canola and high
erucic acid varieties), Crambe, Brassica juncea,
Brassica nigra, meadowfoam, flax, sunflower,
safflower, cotton, Cuphea, soybean, peanut, coconut
and oil palms and corn. An important criterion in
the selection of suitable plants for the
introduction of the kappa hydroxylase is the
presence in the host plant of a suitable substrate
for the hydroxylase. Thus, for example, production
of ricinoleic acid will be best accomplished in
plants that normally have high levels of oleic acid
in seed lipids. Similarly, production of lesquerolic
acid will best be accomplished in plants that have
high levels of icosenoic acid in seed lipids.

Depending on the method for introducing the
recombinant constructs into the host cell, other DNA
sequences may be required. Importantly, this
invention is applicable to dicotyledon and
monocotyledon species alike and will be readily
applicable to new and/or improved transformation and
regulation techniques. The method of transformation
is not critical to the current invention; various
methods of plant transformation are currently
available. As newer methods are available to
transform crops, they may be directly applied
hereunder. For example, many plant species naturally
susceptible to Agrobacterium infection may be
successfully transformed via tripartite or binary
vector methods of Agrobacterium-mediated
transformation. In addition, techniques of
microinjection, DNA particle bombardment and
electroporation have been developed which allow for
the transformation of various monocot and dicot
plant species.

In developing the DNA construct, the various
components of the construct or fragments thereof
will normally be inserted into a convenient cloning
vector which is capable of replication in a
bacterial host, e.g., E. coli. Numerous vectors
exist that have been described in the literature.
After each cloning, the plasmid may be isolated and
subjected to further manipulation, such as
restriction, insertion of new fragments, ligation,
deletion, insertion, resection, etc., so as to
tailor the components of the desired sequence. Once
the construct has been completed, it may then be
transferred to an appropriate vector for further
manipulation in accordance with the manner of
transformation of the host cell.

Normally, included with the DNA construct
will be a structural gene having the necessary
regulatory regions for expression in a host and
providing for selection of transformant cells. The
gene may provide for resistance to a cytotoxic
agent, e.g., antibiotic, heavy metal, toxin, etc.,
complementation providing prototrophy to an
auxotrophic host, viral immunity or the like.
Depending upon the number of different host species
into which the expression construct or components
thereof are introduced, one or more markers may be
employed, where different conditions for selection
are used for the different hosts.

It is noted that the degeneracy of the DNA
code provides that some codon substitutions are
permissible of DNA sequences without any
corresponding modification of the amino acid
sequence.
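
The effect of codon degeneracy may be illustrated with
a short sketch. The two DNA sequences below are
hypothetical and were chosen only so that they differ at
three codon positions while encoding the same peptide.
The translation uses the standard genetic code.

```python
from itertools import product

# Standard genetic code in TCAG order (NCBI translation table 1).
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product("TCAG", repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate an in-frame DNA sequence; '*' marks a stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical sequences: GCT->GCC, CTT->CTG and AAA->AAG are synonymous.
ORIGINAL = "ATGGCTCTTAAATGA"
VARIANT = "ATGGCCCTGAAGTGA"

print(translate(ORIGINAL), translate(VARIANT))
```

Both sequences translate to the same peptide even though
the DNA differs, which is the property relied upon when a
codon is changed to introduce a restriction site.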

As mentioned above, the manner in which the
DNA construct is introduced into the plant host is
not critical to this invention. Any method which
provides for efficient transformation may be
employed. Various methods for plant cell
transformation include the use of Ti- or
Ri-plasmids, microinjection, electroporation,
infiltration, imbibition, DNA particle bombardment,
liposome fusion, DNA bombardment or the like. In
many instances, it will be desirable to have the
construct bordered on one or both sides of the
T-DNA, particularly having the left and right borders,
more particularly the right border. This is
particularly useful when the construct uses A.
tumefaciens or A. rhizogenes as a mode for
transformation, although the T-DNA borders may find
use with other modes of transformation.

Where Agrobacterium is used for plant cell
transformation, a vector may be used which may be
introduced into the Agrobacterium host for
homologous recombination with T-DNA or the Ti- or
Ri-plasmid present in the Agrobacterium host. The
Ti- or Ri-plasmid containing the T-DNA for
recombination may be armed (capable of causing gall
formation) or disarmed (incapable of causing gall),
the latter being permissible, so long as the vir
genes are present in the transformed Agrobacterium
host. The armed plasmid can give a mixture of normal
plant cells and gall.

In some instances where Agrobacterium is used
as the vehicle for transforming plant cells, the
expression construct bordered by the T-DNA border(s)
will be inserted into a broad host spectrum vector,
there being broad host spectrum vectors described in
the literature. Commonly used is pRK2 or derivatives
thereof. See, for example, Ditta et al. (1980),
which is incorporated herein by reference. Included
with the expression construct and the T-DNA will be
one or more markers, which allow for selection of
transformed Agrobacterium and transformed plant
cells. A number of markers have been developed for
use with plant cells, such as resistance to
kanamycin, the aminoglycoside G418, hygromycin, or
the like. The particular marker employed is not
essential to this invention, one or another marker
being preferred depending on the particular host and
the manner of construction.

For transformation of plant cells using
Agrobacterium, explants may be combined and
incubated with the transformed Agrobacterium for
sufficient time for transformation, the bacteria
killed, and the plant cells cultured in an
appropriate selective medium. Once callus forms,
shoot formation can be encouraged by employing the
appropriate plant hormones in accordance with known
methods and the shoots transferred to rooting medium
for regeneration of plants. The plants may then be
grown to seed and the seed used to establish
repetitive generations and for isolation of
vegetable oils.

Using Hydroxylase Genes to Alter the Activity of
Fatty Acid Desaturases

A widely acknowledged goal of current efforts
to improve the nutritional quality of edible plant
oils, or to facilitate industrial applications of
plant oils, is to alter the level of desaturation of
plant storage lipids (Topfer et al., 1995). In
particular, in many crop species it is considered
desirable to reduce the level of polyunsaturation of
storage lipids and to increase the level of oleic
acid. The precise amount of the various fatty acids
in a particular plant oil varies with the intended
application. Thus, it is desirable to have a robust
method that will permit genetic manipulation of the
level of unsaturation to any desired level.

Substantial progress has recently been made
in the isolation of genes encoding plant fatty acid
desaturases (reviewed in Topfer et al., 1995). These
genes have been introduced into various plant
species and used to alter the level of fatty acid
unsaturation in one of three ways. First, the genes
can be placed under transcriptional control of a
strong promoter so that the amount of the
corresponding enzyme is increased. In some cases
this leads to an increase in the amount of the fatty
acid that is the product of the reaction catalyzed
by the enzyme. For example, Arondel et al. (1992)
increased the amount of linolenic acid (18:3) in
tissues of transgenic Arabidopsis plants by placing
the endoplasmic reticulum-localized fad3 gene under
transcriptional control of the strong constitutive
cauliflower mosaic virus 35S promoter.

A second method of using cloned genes to
alter the level of fatty acid unsaturation is to
cause transcription of all or part of a gene in
transgenic tissues so that the transcripts have an
antisense orientation relative to the normal mode of
transcription. This has been used by a number of
laboratories to reduce the level of expression of
one or more desaturase genes that have significant
nucleotide sequence homology to the gene used in the
construction of the antisense gene (reviewed in
Topfer et al.). For instance, antisense repression
of the oleate Δ12-desaturase in transgenic rapeseed
resulted in a strong increase in oleic acid content
(cf., Topfer et al., 1995).

A third method for using cloned genes to
alter fatty acid desaturation is to exploit the
phenomenon of cosuppression or "gene-silencing"
(Matzke et al., 1995). Although the mechanisms
responsible for gene silencing are not known in any
detail, it has frequently been observed that in
transgenic plants, expression of an introduced gene
leads to inactivation of homologous endogenous
genes.

For example, high-level sense expression of
the Arabidopsis fad8 gene, which encodes a
chloroplast-localized Δ15-desaturase, in transgenic
Arabidopsis plants caused suppression of the
endogenous copy of the fad8 gene and the homologous
fad7 gene (which encodes an isozyme of the fad8
gene) (Gibson et al., 1994). The fad7 and fad8 genes
are only 76% identical at the nucleotide level. At
the time of publication, this example represented
the most divergent pair of plant genes for which
cosuppression had been observed.
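
For reference, a figure such as "76% identical at the
nucleotide level" is obtained by counting matching
columns in a pairwise alignment. The sketch below shows
one common convention, in which columns containing a gap
in either sequence are excluded from the count. The short
sequences are hypothetical and are not fad7 or fad8
sequence, and other conventions for treating gaps exist.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free alignment columns with identical bases."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared


# Hypothetical aligned fragments differing at two of ten positions.
print(percent_identity("ATGCATGCAT", "ATGCTTGCAA"))
```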

In view of previous evidence concerning the
relatively high level of nucleotide sequence
homology required to obtain cosuppression, it is not
obvious to one skilled in the art that sense
expression in transgenic plants of the castor fatty
acyl hydroxylase of this invention would
significantly alter the amount of unsaturation of
storage lipids.

However, the present inventors establish that
fatty acyl hydroxylase genes can be used for this
purpose as taught in Example 4 of this
specification. Of particular importance, this
invention teaches the use of fatty acyl hydroxylase
genes to increase the proportion of oleic acid in
transgenic plant tissues. The mechanism by which
expression of the gene exerts this effect is not
known but may be due to one of several possibilities
which are elaborated upon in Example 4.

The invention now being generally described,
it will be more readily understood by reference to
the following examples which are included for
purposes of illustration only and are not intended
to limit the present invention.



EXAMPLES 

In the experimental disclosure which follows,
all temperatures are given in degrees centigrade
(°C), weights are given in grams (g), milligrams (mg)
or micrograms (µg), concentrations are given as
molar (M), millimolar (mM) or micromolar (µM) and
all volumes are given in liters (l), microliters
(µl) or milliliters (ml), unless otherwise
indicated.

EXAMPLE 1 - PRODUCTION OF NOVEL HYDROXYLATED FATTY
ACIDS IN ARABIDOPSIS THALIANA

Overview 

The kappa hydroxylase encoded by the fah12
gene from castor was used to produce ricinoleic
acid, lesquerolic acid, densipolic acid and
auricolic acid in transgenic Arabidopsis plants.

Production of transgenic plants

A variety of methods have been developed to
insert a DNA sequence of interest into the genome of
a plant host to obtain the transcription and
translation of the sequence to effect phenotypic
changes. The following methods represent only one of
many equivalent means of producing transgenic plants
and causing expression of the hydroxylase gene.

Arabidopsis plants were transformed, by
Agrobacterium-mediated transformation, with the
kappa hydroxylase encoded by the castor fah12 gene
on binary Ti plasmid pB6. This plasmid has also been
used to transform Nicotiana tabacum for the
production of ricinoleic acid.

Inoculums of Agrobacterium tumefaciens strain
GV3101 containing binary Ti plasmid pB6 were plated
on L-broth plates containing 50 µg/ml kanamycin and
incubated for 2 days at 30°C. Single colonies were
used to inoculate large liquid cultures (L-broth
medium with 50 mg/l rifampicin, 110 mg/l gentamycin
and 200 mg/l kanamycin) to be used for the
transformation of Arabidopsis plants.

Arabidopsis plants were transformed by the in
planta transformation procedure essentially as
described by Bechtold et al. (1993). Cells of A.
tumefaciens GV3101(pB6) were harvested from liquid
cultures by centrifugation, then resuspended in
infiltration medium at OD600 = 0.8. Infiltration
medium was Murashige and Skoog macro and
micronutrient medium (Sigma Chemical Co., St. Louis,
MO) containing 10 mg/l 6-benzylaminopurine and 5%
glucose. Batches of 12-15 plants were grown for 3 to
4 weeks in natural light at a mean daily temperature
of approximately 25°C in 3.5 inch pots containing
soil. The intact plants were immersed in the
bacterial suspension then transferred to a vacuum
chamber and placed under 600 mm of vacuum produced
by a laboratory vacuum pump until tissues appeared
uniformly water-soaked (approximately 10 min). The
plants were grown at 25°C under continuous light
(100 µmol m⁻² s⁻¹ irradiation in the 400 to 700 nm
range) for four weeks. The seeds obtained from all
the plants in a pot were harvested as one batch. The
seeds were sterilized by sequential treatment for 2
min with ethanol followed by 10 min in a mixture of
household bleach (Chlorox), water and Tween-80 (50%,
50%, 0.05%) then rinsed thoroughly with sterile
water. The seeds were plated at high density (2000
to 4000 per plate) onto agar-solidified medium in
100 mm petri plates containing 1/2 X Murashige and
Skoog salts medium enriched with B5 vitamins (Sigma
Chemical Co., St. Louis, MO) and containing
kanamycin at 50 mg/l. After incubation for 48 h at
4°C to stimulate germination, seedlings were grown
for a period of seven days until transformants were
clearly identifiable as healthy green seedlings
against a background of chlorotic
kanamycin-sensitive seedlings. The transformants were
transferred to soil for two weeks before leaf tissue
could be used for DNA and lipid analysis. More than
20 transformants were obtained.

DNA was extracted from young leaves from
transformants to verify the presence of an intact
fah12 gene. The presence of the transgene in a
number of the putative transgenic lines was verified
by using the polymerase chain reaction to amplify
the insert from pB6. The primers used were HF2 =
GCTCTTTTGTGCGCTCATTC (SEQ ID NO: 12) and HR1 =
CGGTACCAGAAAACGCCTTG (SEQ ID NO: 13), which were
designed to allow the amplification of a 700 bp
fragment. Approximately 100 ng of genomic DNA was
added to a solution containing 25 pmol of each
primer, 1.5 U Taq polymerase (Boehringer Mannheim),
200 µM of dNTPs, 50 mM KCl, 10 mM Tris.Cl (pH 9),
0.1% (v/v) Triton X-100, 1.5 mM MgCl2, 3% (v/v)
formamide, to a final volume of 50 µl.
Amplification conditions were: 4 min denaturation
step at 94°C, followed by 30 cycles of 92°C for 1
min, 55°C for 1 min, 72°C for 2 min. A final
extension step closed the program at 72°C for 5 min.
Transformants could be positively identified after
visualization of a characteristic 1 kb amplified
fragment on an ethidium bromide stained agarose gel.
All transgenic lines tested gave a PCR product of a
size consistent with the expected genotype,
confirming that the lines were, indeed, transgenic.
All further experiments were done with three
representative transgenic lines of the wild type
designated as 1-3, 4D, 7-4 and one transgenic line
of the fad2 mutant line JB12. The transgenic JB12
line was included in order to test whether the
increased accumulation of oleic acid in this mutant
would have an effect on the amount of ricinoleic
acid that accumulated in the transgenic plants.
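
As a rough check on the PCR conditions above, the
following sketch computes the GC content of the two
primers, an approximate melting temperature, and the
nominal length of the thermal program. The melting
temperature uses the Wallace rule of thumb
(2°C per A or T, 4°C per G or C), which is an assumption
introduced here and is not stated in this specification.
The program length ignores ramping between temperatures.

```python
# Primer sequences as given in the text (SEQ ID NO: 12 and 13).
HF2 = "GCTCTTTTGTGCGCTCATTC"
HR1 = "CGGTACCAGAAAACGCCTTG"


def gc_percent(primer):
    """Percentage of G and C bases in the primer."""
    primer = primer.upper()
    gc = primer.count("G") + primer.count("C")
    return 100.0 * gc / len(primer)


def wallace_tm(primer):
    """Approximate Tm in degrees C: 2 per A/T plus 4 per G/C."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def program_minutes(initial, cycles, steps_per_cycle, final):
    """Nominal run time: hold times only, no ramping."""
    return initial + cycles * sum(steps_per_cycle) + final


for name, seq in (("HF2", HF2), ("HR1", HR1)):
    print(name, len(seq), gc_percent(seq), wallace_tm(seq))

# 4 min at 94, then 30 cycles of 1 + 1 + 2 min, then 5 min at 72.
TOTAL_MINUTES = program_minutes(4, 30, (1, 1, 2), 5)
print(TOTAL_MINUTES)
```

Under these assumptions both estimates lie a few degrees
above the 55°C annealing step. The presence of 3%
formamide in the reaction is not accounted for by this
simple rule.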

Analysis of transgenic plants 

Leaves and seeds from fah12 transgenic
Arabidopsis plants were analyzed for the presence of
hydroxylated fatty acids using gas chromatography.
Lipids were extracted from 100-200 mg leaf tissue or
50 seeds. Fatty acid methyl esters (FAMES) were
prepared by placing tissue in 1.5 ml of 1.0 M
methanolic HCl (Supelco Co.) in a 13 x 100 mm glass
screw-cap tube capped with a teflon-lined cap and
heated to 80°C for 2 hours. Upon cooling, 1 ml
petroleum ether was added and the FAMES removed by
aspirating off the ether phase which was then dried
under a nitrogen stream in a glass tube. One hundred
µl of N,O-bis(trimethylsilyl)trifluoroacetamide
(BSTFA; Pierce Chemical Co) and 200 µl acetonitrile
were added to derivatize the hydroxyl groups. The
reaction was carried out at 70°C for 15 min. The
products were dried under nitrogen, redissolved in
100 µl chloroform and transferred to a gas
chromatograph vial. Two µl of each sample were
analyzed on a SP2340 fused silica capillary column
(30 m, 0.75 mm ID, 0.20 mm film, Supelco), using a
Hewlett-Packard 5890 II series Gas Chromatograph.
The samples were not split, the temperature program
was 195°C for 18 min, increased to 230°C at
25°C/min, held at 230°C for 5 min then down to 195°C
at 25°C/min, and flame ionization detectors were
used.
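
The oven temperature program described above can be
written out as a sequence of segments, which makes the
nominal run time and the oven temperature at a given
elution time easy to read off. The sketch below is
illustrative only. The function names are hypothetical,
and instrument lag and equilibration time are ignored.

```python
RAMP_RATE = 25.0  # degrees C per minute, as stated in the text


def ramp_minutes(start_c, end_c, rate=RAMP_RATE):
    """Time needed to move between two temperatures at a fixed rate."""
    return abs(end_c - start_c) / rate


# Each segment is (start temperature, end temperature, minutes).
PROGRAM = [
    (195.0, 195.0, 18.0),                        # initial hold
    (195.0, 230.0, ramp_minutes(195.0, 230.0)),  # ramp up
    (230.0, 230.0, 5.0),                         # final hold
    (230.0, 195.0, ramp_minutes(230.0, 195.0)),  # return to start
]


def total_minutes(program):
    """Nominal length of the whole program."""
    return sum(minutes for _, _, minutes in program)


def temperature_at(program, t):
    """Nominal oven temperature t minutes after injection."""
    if t < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0.0
    for start, end, minutes in program:
        if t <= elapsed + minutes:
            fraction = (t - elapsed) / minutes
            return start + (end - start) * fraction
        elapsed += minutes
    raise ValueError("time is past the end of the program")


print(total_minutes(PROGRAM))
print(temperature_at(PROGRAM, 18.9))
```

On this reading, compounds eluting before 18 min leave
the column under isothermal conditions, while later peaks
elute during the ramp or the 230°C hold.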

The chromatographic elution time of methyl
esters and O-TMS derivatives of ricinoleic acid,
lesquerolic acid and auricolic acid was established
by GC-MS of lipid samples from seeds of L. fendleri
and comparison to published chromatograms of fatty
acids from this species (Carlson et al., 1990). An
O-TMS-methyl-ricinoleate standard was prepared from
ricinoleic acid obtained from Sigma Chemical Co (St.
Louis, MO). O-TMS-methyl-lesqueroleate and
O-TMS-methyl-auricoleate standards were prepared from
triacylglycerols purified from seeds of L. fendleri.
The mass spectra of O-TMS-methyl-ricinoleate,
O-TMS-methyl-densipoleate, O-TMS-methyl-lesqueroleate,
and O-TMS-methyl-auricoleate are shown in Figures
1A-D, respectively. The structures of the
characteristic ions produced during mass
spectrometry of these derivatives are shown in
Figure 2.

Lipids extracted from transgenic tissues were
analyzed by gas chromatography and mass spectrometry
for the presence of hydroxylated fatty acids. As a
matter of reference, the average fatty acid
composition of leaves in Arabidopsis wild type and
fad2 mutant lines was reported by Miquel and Browse
(1992). Gas chromatograms of methylated and
silylated fatty acids from seeds of wild type and a
fah12 transgenic wild type plant are shown in
Figures 3A and 3B, respectively. The profiles are
very similar except for the presence of three small
but distinct peaks at 14.3, 15.9 and 18.9 minutes. A
very small peak at 20.15 min was also evident. The
elution times of the peaks at 14.3 and 18.9 min
corresponded precisely to those of comparably
prepared ricinoleic and lesquerolic standards,
respectively. No significant differences were
observed in lipid extracts from leaves or roots of
the wild type and the fah12 transgenic wild type
lines (Table 1).

Thus, although the fah12 gene is expressed
throughout the plant, effects on fatty acid
composition were observed only in seed tissue. The
present inventors have made a similar observation
for transgenic fah12 tobacco.

Table 1. Fatty acid composition of lipids from
transgenic and wild type Arabidopsis. The values are
the means obtained from analysis of samples from
three independent transgenic lines, or three
independent samples of wild type and fad2 lines.

In order to confirm that the observed new
peaks in the transgenic lines corresponded to
derivatives of ricinoleic, lesquerolic, densipolic
and auricolic acids, mass spectrometry was used. The
fatty acid derivatives were resolved by gas
chromatography as described above except that a
Hewlett-Packard 5971 series mass selective detector
was used in place of the flame ionization detector
used in the previous experiment. The spectra of the
four new peaks in Figure 3B (peak numbers 10, 11, 12
and 13) are shown in Figures 4A-D, respectively.
Comparison of the spectra obtained for the
standards with those obtained for the four peaks from
the transgenic lines confirms the identity of the
four new peaks. On the basis of the three
characteristic peaks at M/Z 187, 270 and 299, peak
10 is unambiguously identified as
O-TMS-methylricinoleate. On the basis of the three
characteristic peaks at M/Z 185, 270 and 299, peak
11 is unambiguously identified as
O-TMS-methyldensipoleate. On the basis of the three
characteristic peaks at M/Z 187, 298 and 327, peak
12 is unambiguously identified as
O-TMS-methyllesqueroleate. On the basis of the three
characteristic peaks at M/Z 185, 298 and 327, peak
13 is unambiguously identified as
O-TMS-methylauricoleate.
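
The peak assignments above amount to a lookup from three diagnostic ions to a derivative. The sketch below restates that lookup; the M/Z triplets are taken from the text, while the function name and the rule that all three ions must be present are illustrative assumptions rather than the inventors' procedure.

```python
# Diagnostic ion triplets (M/Z) as listed in the text for each O-TMS methyl ester.
DIAGNOSTIC_IONS = {
    frozenset({187, 270, 299}): "O-TMS-methylricinoleate",
    frozenset({185, 270, 299}): "O-TMS-methyldensipoleate",
    frozenset({187, 298, 327}): "O-TMS-methyllesqueroleate",
    frozenset({185, 298, 327}): "O-TMS-methylauricoleate",
}


def identify(observed_mz):
    """Return the derivatives whose three diagnostic ions all appear in a spectrum.

    `observed_mz` is any iterable of integer M/Z values; intensities are ignored
    in this simplified sketch.
    """
    observed = set(observed_mz)
    return [name for ions, name in DIAGNOSTIC_IONS.items() if ions <= observed]


if __name__ == "__main__":
    # Hypothetical spectrum: the three ricinoleate ions plus one unrelated ion.
    print(identify([73, 187, 270, 299]))
```

In this scheme the ion at 187 or 185 separates the monoenoic from the dienoic derivatives, and the pair 270/299 or 298/327 separates the 18-carbon from the 20-carbon derivatives.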

These results unequivocally demonstrate that
the fah12 cDNA encodes a hydroxylase that
hydroxylates both oleic acid, to produce
ricinoleic acid, and icosenoic acid, to produce
lesquerolic acid. These results also
provide additional evidence that the hydroxylase can
be functionally expressed in a heterologous plant
species in such a way that the enzyme is
catalytically functional. These results also
demonstrate that expression of this hydroxylase gene
leads to accumulation of ricinoleic, lesquerolic,
densipolic and auricolic acids in a plant species
that does not normally accumulate hydroxylated fatty
acids in extractable lipids.

The present inventors expected to find
lesquerolic acid in the transgenic plants based on
the biochemical evidence suggesting broad substrate
specificity of the kappa hydroxylase. By contrast,
the accumulation of densipolic and auricolic acids
was less predictable. Since Arabidopsis does not
normally contain significant quantities of the
non-hydroxylated precursors of these fatty acids which
could serve as substrates for the hydroxylase, it
appears that one or more of the three n-3 fatty acid
desaturases known in Arabidopsis (e.g., fad3, fad7,
fad8; reviewed in Gibson et al., 1995) are capable
of desaturating the hydroxylated compounds at the
n-3 position. That is, densipolic acid is produced by
the action of an n-3 desaturase on ricinoleic acid.
Auricolic acid is produced by the action of an n-3
desaturase on lesquerolic acid. Because it is
located in the endoplasmic reticulum, the fad3
desaturase is almost certainly responsible. This can
be tested in the future by producing
fah12-containing transgenic plants of the fad3-deficient
mutant of Arabidopsis (similar experiments can be
done with fad7 and fad8). It is also formally
possible that the enzymes that normally elongate
18:1(cisΔ9) to 20:1(cisΔ11) may elongate
12OH-18:1(cisΔ9) to 14OH-20:1(cisΔ11), and
12OH-18:2(cisΔ9,15) to 14OH-20:2(cisΔ11,17).

The amount of the various fatty acids in
seed, leaf and root lipids of the control and
transgenic plants is also presented in Table 1.
Although the amount of hydroxylated fatty acids
produced in this example is less than desired for
production of ricinoleate and other hydroxylated
fatty acids from plants, numerous improvements may
be envisioned that will increase the level of
accumulation of hydroxylated fatty acids in plants
that express the fah12 or related hydroxylase genes.
Improvements in the level and tissue specificity of
expression of the hydroxylase gene are envisioned.
Methods to accomplish this by the use of strong,
seed-specific promoters such as the B. napus napin
promoter will be obvious to one skilled in the art.
Additional improvements are envisioned that involve
modification of the enzymes which cleave
hydroxylated fatty acids from phosphatidylcholine,
reduction in the activities of enzymes which degrade
hydroxylated fatty acids and replacement of
acyltransferases which transfer hydroxylated fatty
acids to the sn-1, sn-2 and sn-3 positions of
glycerolipids. Although genes for these enzymes have
not been described in the scientific literature,
their utility in improving the level of production
of hydroxylated fatty acids can be readily
appreciated based on the results of biochemical
investigations of ricinoleate synthesis.

Although Arabidopsis is not an economically
important plant species, it is widely accepted by
plant biologists as a model for higher plants.
Therefore, the inclusion of this example is intended
to demonstrate the general utility of the invention
described here to the modification of oil
composition in higher plants. One advantage of
studying the expression of this novel gene in
Arabidopsis is the existence in this system of a
large body of knowledge on lipid metabolism, as well
as the availability of a collection of mutants which
can be used to provide useful information on the
biochemistry of fatty acid hydroxylation in plant
species. Another advantage is the ease of
transposing any of the information obtained on
metabolism of ricinoleate in Arabidopsis to closely
related species such as the crop plants Brassica
napus, Brassica juncea or Crambe abyssinica in order
to mass produce ricinoleate, lesqueroleate or other
hydroxylated fatty acids for industrial use. The
kappa hydroxylase is useful for the production of
ricinoleate or lesqueroleate in any plant species
that accumulates significant levels of the
precursors, oleic acid and icosenoic acid. Of
particular interest are genetically modified
varieties that accumulate high levels of oleic acid.
Such varieties are currently available for sunflower
and canola. Production of lesquerolic acid and
related hydroxy fatty acids can be achieved in
species that accumulate high levels of icosenoic
acid or other long chain monoenoic acids. Such
plants may in the future be produced by genetic
engineering of plants that do not normally make such
precursors. Thus, the use of the kappa hydroxylase
will be of general utility.

EXAMPLE 2. ISOLATION OF LESQUERELLA KAPPA
HYDROXYLASE GENOMIC CLONE

Overview 

Regions of nucleotide sequence that were
conserved in both the castor kappa hydroxylase and
the Arabidopsis fad2 Δ12 fatty acid desaturase were
used to design oligonucleotide primers. These were
used with genomic DNA from Lesquerella fendleri to
amplify fragments of several homologous genes. These
amplified fragments were then used as hybridization
probes to identify full length genomic clones from a
genomic library of L. fendleri.

Hydroxylated fatty acids are specific to the
seed tissue of Lesquerella sp., and are not found to
any appreciable extent in vegetative tissues. One of
the two genes identified by this method was
expressed in both leaves and developing seeds and is
therefore thought to correspond to the Δ12 fatty
acid desaturase. The other gene was expressed at
high levels in developing seeds but was not
expressed or was expressed at very low levels in
leaves and is the kappa hydroxylase from this
species. The identity of the gene as a fatty acyl
hydroxylase was established by functional expression
of the gene in yeast.

The identity of this gene will also be
established by introducing the gene into transgenic
Arabidopsis plants and showing that it causes the
accumulation of ricinoleic acid, lesquerolic acid,
densipolic acid and auricolic acid in seed lipids.

The various steps involved in this process
are described in detail below. Unless otherwise
indicated, routine methods for manipulating nucleic
acids, bacteria and phage were as described by
Sambrook et al. (1989).

Isolation of a fragment of the Lesquerella kappa
hydroxylase gene

Oligonucleotide primers for the amplification
of the L. fendleri kappa hydroxylase were designed
by choosing regions of high deduced amino acid
sequence homology between the castor kappa
hydroxylase and the Arabidopsis Δ12 desaturase
(fad2). Because most amino acids are encoded by
several different codons, these oligonucleotides
were designed to include all possible codons that
could encode the corresponding amino acids.

The sequences of these mixed oligonucleotides
were Oligo 1: TAYWSNCAYMGNMGNCAYCA (SEQ ID NO: 14) and
Oligo 2: RTGRTGNGCNACRTGNGTRTC (SEQ ID NO: 15) where
Y = C+T, W = A+T, S = G+C, N = A+G+C+T, M = A+C, and
R = A+G.
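
The degeneracy of each mixed oligonucleotide follows directly from the code letters defined above. As a minimal sketch (the function names are illustrative assumptions, and only the ambiguity codes listed in the text are handled), the number of distinct sequences in each mixture is the product of the number of bases allowed at each position:

```python
from itertools import product
from math import prod

# Ambiguity codes exactly as defined in the text, plus the four plain bases.
CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "CT", "W": "AT", "S": "GC", "M": "AC", "R": "AG",
    "N": "AGCT",
}

OLIGO_1 = "TAYWSNCAYMGNMGNCAYCA"   # SEQ ID NO: 14
OLIGO_2 = "RTGRTGNGCNACRTGNGTRTC"  # SEQ ID NO: 15


def degeneracy(oligo):
    """Number of distinct sequences represented by a mixed oligonucleotide."""
    return prod(len(CODES[base]) for base in oligo)


def expand(oligo):
    """Yield every unambiguous sequence represented by a mixed oligonucleotide."""
    for combo in product(*(CODES[base] for base in oligo)):
        yield "".join(combo)


if __name__ == "__main__":
    for name, oligo in (("Oligo 1", OLIGO_1), ("Oligo 2", OLIGO_2)):
        print(name, len(oligo), "nt, degeneracy", degeneracy(oligo))
```

Each individual sequence is therefore present at only a small fraction of the total primer added to a reaction, which is one reason degenerate primers are typically used in relatively large amounts.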

These oligonucleotides were used to amplify a
fragment of DNA from L. fendleri genomic DNA by the
polymerase chain reaction (PCR) using the following
conditions: Approximately 100 ng of genomic DNA was
added to a solution containing 25 pmol of each
primer, 1.5 U Taq polymerase (Boehringer Mannheim),
200 µM of dNTPs, 50 mM KCl, 10 mM Tris.Cl (pH 9),
0.1% (v/v) Triton X-100, 1.5 mM MgCl2, 3% (v/v)
formamide, to a final volume of 50 µl.
Amplification conditions were: 4 min denaturation
step at 94°C, followed by 30 cycles of 92°C for 1
min, 55°C for 1 min, 72°C for 2 min. A final
extension step closed the program at 72°C for 5 min.
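
The cycling program above can be written down as data to make the total programmed time explicit. This is a hedged sketch: the class and function names are illustrative assumptions, the annealing temperature of 55°C is a reading of a garbled value in the source text, and instrument ramp times between temperatures are not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    label: str
    temp_c: float
    minutes: float


INITIAL = Step("initial denaturation", 94, 4)
CYCLE = [
    Step("denaturation", 92, 1),
    Step("annealing", 55, 1),   # assumed reading of the garbled source value
    Step("extension", 72, 2),
]
FINAL = Step("final extension", 72, 5)
N_CYCLES = 30


def programmed_minutes(initial, cycle, n_cycles, final):
    """Sum of programmed hold times, ignoring ramping between temperatures."""
    per_cycle = sum(step.minutes for step in cycle)
    return initial.minutes + n_cycles * per_cycle + final.minutes


if __name__ == "__main__":
    total = programmed_minutes(INITIAL, CYCLE, N_CYCLES, FINAL)
    print(f"programmed hold time: {total} min")
```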

PCR products of approximately 540 bp were
observed following electrophoretic separation of the
products of the PCR reaction in agarose gels. Two of
these fragments were cloned into pBluescript
(Stratagene) to give rise to plasmids pLesq2 and
pLesq3. The sequence of the inserts in these two
plasmids was determined by the chain termination
method. The sequence of the insert in pLesq2 is
presented as Figure 5 (SEQ ID NO: 1) and the sequence
of the insert in pLesq3 is presented as Figure 6
(SEQ ID NO: 2). The high degree of sequence identity
between the two clones indicated that they were both
potential candidates to be either a Δ12 desaturase
or a kappa hydroxylase.

Northern analysis

In L. fendleri, hydroxylated fatty acids are
found in large amounts in seed oils but are not
found in appreciable amounts in leaves. An important
criterion in discriminating between a fatty acyl
desaturase and kappa hydroxylase is that the kappa
hydroxylase gene is expected to be expressed more
highly in tissues which have high levels of
hydroxylated fatty acids than in other tissues. In
contrast, all plant tissues should contain mRNA for
an ω6 fatty acyl desaturase since diunsaturated
fatty acids are found in the lipids of all tissues
in most or all plants.

Therefore, it was of great interest to
determine whether the gene corresponding to pLesq2
was expressed only in seeds, or was also
expressed in other tissues. This question was
addressed by testing for hybridization of pLesq2 to
RNA purified from developing seeds and from leaves.

Total RNA was purified from developing seeds
and young leaves of L. fendleri using an RNeasy RNA
extraction kit (Qiagen), according to the
manufacturer's instructions. RNA concentrations were
quantified by UV spectrophotometry at λ = 260 and 280
nm. In order to ensure even loading of the gel to be
used for Northern blotting, RNA concentrations were
further adjusted after recording fluorescence under
UV light of RNA samples stained with ethidium
bromide and run on a test denaturing gel.

Total RNA prepared as described above from
leaves and developing seeds was electrophoresed
through an agarose gel containing formaldehyde (Iba
et al., 1993). An equal quantity (10 µg) of RNA was
loaded in both lanes, and RNA standards (0.16-1.77
kb ladder, Gibco-BRL) were loaded in a third lane.
Following electrophoresis, RNA was transferred from
the gel to a nylon membrane (Hybond N+, Amersham)
and fixed to the filter by exposure to UV light.

A 32P-labelled probe was prepared from insert
DNA of clone pLesq2 by random priming and hybridized
to the membrane overnight at 52°C, after it had been
prehybridized for 2 h. The prehybridization solution
contained 5X SSC, 10X Denhardt's solution, 0.1% SDS,
0.1 M KPO4 pH 6.8, 100 µg/ml salmon sperm DNA. The
hybridization solution had the same basic
composition, but no SDS, and it contained 10%
dextran sulfate and 30% formamide. The blot was
washed once in 2X SSC, 0.5% SDS at 65°C then in 1X
SSC at the same temperature.

Brief (30 min) exposure of the blot to X-ray
film revealed that the probe pLesq2 hybridized to a
single band only in the seed RNA lane (Figure 7).
The blot was re-probed with the insert from the
pLesq3 gene, which gave bands of similar intensity in
the seed and leaf lanes (Figure 7).



WO 97/30582    PCT/US97/02187



These results show that the gene
corresponding to the clone pLesq2 is highly and
specifically expressed in seed of L. fendleri. In
conjunction with knowledge of the nucleotide and
deduced amino acid sequence, strong seed-specific
expression of the gene corresponding to the insert
in pLesq2 is a convincing indicator of the role of
the enzyme in synthesis of hydroxylated fatty acids
in the seed oil.

Characterization of a genomic clone of the kappa
hydroxylase

Genomic DNA was prepared from young leaves of
L. fendleri as described by Murray and Thompson
(1980). A Sau3AI-partial digest genomic library
constructed in the vector λDashII (Stratagene, 11011
North Torrey Pines Road, La Jolla CA 92037) was
prepared by partially digesting 500 µg of DNA,
size-selecting the DNA on a sucrose gradient (Sambrook et
al., 1989), and ligating the DNA (12 kb average
size) to the BamHI-digested arms of λDashII. The
entire ligation was packaged according to the
manufacturer's conditions and plated on E. coli
strain XL1-Blue MRA-P2 (Stratagene). This yielded
5x10^ primary recombinant clones. The library was
then amplified according to the manufacturer's
conditions. A fraction of the genomic library was
plated on E. coli XL1-Blue and resulting plaques
(150,000) were lifted to charged nylon membranes
(Hybond N+, Amersham), according to the
manufacturer's conditions. DNA was crosslinked to
the filters under UV in a Stratalinker (Stratagene).

Several clones carrying genomic sequences
corresponding to the L. fendleri hydroxylase were
isolated by probing the membranes with the insert
from pLesq2 that was PCR-amplified with internal
primers and labelled with 32P by random priming. The
filters were prehybridized for 2 hours at 65°C in 7%
SDS, 1 mM EDTA, 0.25 M Na2HPO4 (pH 7.2), 1% BSA and
hybridized to the probe for 16 hours in the same
solution. The filters were sequentially washed at
65°C in solutions containing 2X SSC, 1X SSC, 0.5X
SSC in addition to 0.1% SDS. A 2.6 kb XbaI fragment
containing the complete coding sequence for the
kappa hydroxylase and approximately 1 kb of the 5'
upstream region was subcloned into the corresponding
site of pBluescript KS to produce plasmid pLesq-Hyd
and the sequence determined completely using an
automatic sequencer by the dideoxy chain termination
method. Sequence data were analyzed using the program
DNASIS (Hitachi Company).

The sequence of the insert in clone pLesq-Hyd
is shown in Figures 8A-B. The sequence comprises 1855
bp of contiguous DNA sequence (SEQ ID NO: 3). The
clone encodes a 401 bp 5' untranslated region (i.e.,
nucleotides preceding the first ATG codon), an 1152
bp open reading frame, and a 302 bp 3' untranslated
region. The open reading frame encodes a 384 amino
acid protein with a predicted molecular weight of
44,370 (SEQ ID NO: 4). The amino terminus lacks
features of a typical signal peptide (von Heijne,
1985).
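
The segment lengths quoted above can be cross-checked against the total insert length. The sketch below uses only the figures given in the text; the function name and 1-based coordinate convention are illustrative assumptions, and the text does not say whether the 1152 bp figure includes a stop codon, so the codon count is simply the reading-frame length divided by three.

```python
def layout(five_utr, orf, three_utr):
    """Return 1-based coordinates of the open reading frame within the insert."""
    if orf % 3 != 0:
        raise ValueError("open reading frame length is not a multiple of 3")
    return {
        "total_bp": five_utr + orf + three_utr,
        "orf_start": five_utr + 1,      # position of the first ATG
        "orf_end": five_utr + orf,      # last nucleotide of the reading frame
        "codons": orf // 3,
    }


if __name__ == "__main__":
    # Values as stated for the pLesq-Hyd insert (SEQ ID NO: 3).
    for key, value in layout(401, 1152, 302).items():
        print(key, value)
```

With these figures the three segments sum to the stated 1855 bp, and the first ATG falls at nucleotide 402.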

The exact translation-initiation methionine
has not been experimentally determined, but on the
basis of deduced amino acid sequence homology to the
castor kappa hydroxylase (noted below) is thought to
be the methionine encoded by the first ATG codon at
nucleotide 402.

Comparison of the pLesq-Hyd deduced amino
acid sequence with sequences of membrane-bound
desaturases and the castor hydroxylase (Figures
9A-B) indicates that pLesq-Hyd is homologous to these
genes. This figure shows an alignment of the L.
fendleri hydroxylase (SEQ ID NO: 4) with the castor
hydroxylase (van de Loo et al., 1995), the
Arabidopsis fad2 cDNA which encodes an endoplasmic
reticulum-localized Δ12 desaturase (called fad2)
(Okuley et al., 1994), two soybean fad2 desaturase
clones, a Brassica napus fad2 clone, a Zea mays fad2
clone and partial sequence of an R. communis fad2
clone.

The high degree of sequence homology
indicates that the gene products are of similar
function. For instance, the overall homology between
the Lesquerella hydroxylase and the Arabidopsis fad2
desaturase was 92.2% similarity and 84.8% identity
and the two sequences differed in length by only one
amino acid.

Southern hybridization

Southern analysis was used to examine the
copy number of the genes in the L. fendleri genome
corresponding to the clone pLesq-Hyd. Genomic DNA (5
µg) was digested with EcoRI, HindIII and XbaI and
separated on a 0.9% agarose gel. DNA was
alkali-blotted to a charged nylon membrane (Hybond N+,
Amersham), according to the manufacturer's protocol.
The blot was prehybridized for 2 hours at 65°C in 7%
SDS, 1 mM EDTA, 0.25 M Na2HPO4 (pH 7.2), 1% BSA and
hybridized to the probe for 16 hours in the same
solution with pLesq-Hyd insert PCR-amplified with
internal primers and labelled with 32P by random
priming. The filters were sequentially washed at
65°C in solutions containing 2X SSC, 1X SSC, 0.5X
SSC in addition to 0.1% SDS, then exposed to X-ray
film.

The probe hybridized with a single band in
each digest of L. fendleri DNA (Figure 10),
indicating that the gene from which pLesq-Hyd was
transcribed is present in a single copy in the L.
fendleri genome.

Expression of pLesq-Hyd in Transgenic Plants

There are a wide variety of plant promoter
sequences which may be used to cause tissue-specific
expression of cloned genes in transgenic plants. For
instance, the napin promoter and the acyl carrier
protein promoters have previously been used in the
modification of seed oil composition by expression
of an antisense form of a desaturase (Knutson et
al., 1992). Similarly, the promoter for the
β-subunit of soybean β-conglycinin has been shown to
be highly active and to result in tissue-specific
expression in transgenic plants of species other
than soybean (Bray et al., 1987). Thus, other
promoters which lead to seed-specific expression may
also be employed for the production of modified seed
oil composition. Such modifications of the invention
described here will be obvious to one skilled in the
art.

Constructs for expression of L. fendleri
kappa hydroxylase in plant cells are prepared as
follows: A 13 kb SalI fragment containing the
pLesq-Hyd gene was ligated into the XhoI site of binary Ti
plasmid vector pSLJ44026 (Jones et al., 1992)
(Figure 11) to produce plasmid pTi-Hyd and
transformed into Agrobacterium tumefaciens strain
GV3101 by electroporation. Strain GV3101 (Koncz and
Schell, 1986) contains a disarmed Ti plasmid. Cells
for electroporation were prepared as follows. GV3101
was grown in LB medium with reduced NaCl (5 g/l). A
250 ml culture was grown to OD600 = 0.6, then
centrifuged at 4000 rpm (Sorvall GS-A rotor) for 15
min. The supernatant was aspirated immediately from
the loose pellet, which was gently resuspended in
500 ml ice-cold water. The cells were centrifuged as
before, resuspended in 30 ml ice-cold water,
transferred to a 30 ml tube and centrifuged at 5000
rpm (Sorvall SS-34 rotor) for 5 min. This was
repeated three times, resuspending the cells
consecutively in 30 ml ice-cold water, 30 ml
ice-cold 10% glycerol, and finally in 0.75 ml ice-cold
10% glycerol. These cells were aliquoted, frozen in
liquid nitrogen, and stored at -80°C.

Electroporations employed a Biorad Gene
Pulser instrument using cold 2 mm-gap cuvettes
containing 40 µl cells and 1 µl of DNA in water, at
a voltage of 2.5 kV, and 200 Ohms resistance. The
electroporated cells were diluted with 1 ml SOC
medium (Sambrook et al., 1989, page A2) and
incubated at 28°C for 2-4 h before plating on medium
containing kanamycin (50 mg/l).

Arabidopsis thaliana can be transformed with
the Agrobacterium cells containing pTi-Hyd as
described in Example 1 above. Similarly, the
presence of hydroxylated fatty acids in the
transgenic Arabidopsis plants can be demonstrated
by the methods described in Example 1 above.

Constitutive expression of the L. fendleri
hydroxylase in transgenic plants

A 1.5 kb EcoRI fragment from pLesq-Hyd
comprising the entire coding region of the
hydroxylase was gel purified, then cloned into the
corresponding site of pBluescript KS (Stratagene).
Plasmid DNA from a number of recombinant clones was
then restricted with PstI, which should cut only
once in the insert and once in the vector polylinker
sequence. Release of a 920 bp fragment with PstI
indicated the right orientation of the insert for
further manipulations. DNA from one such clone was
further restricted with SalI, the 5' overhangs
filled in with the Klenow fragment of DNA polymerase
I, then cut with SacI. The insert fragment was gel
purified, and cloned between the SmaI and SacI sites
of pBI121 (Clontech) behind the cauliflower mosaic
virus 35S promoter. After checking that the sequence
of the junction between insert and vector DNA was
appropriate, plasmid DNA from a recombinant clone
was used to transform A. tumefaciens (GV3101).
Kanamycin resistant colonies were then used for in
planta transformation of A. thaliana as previously
described.

DNA was extracted from kanamycin resistant
seedlings and used to PCR-amplify selected fragments
from the hydroxylase using nested primers. When
fragments of the expected size could be amplified,
corresponding plants were grown in the greenhouse or
on agar plates, and fatty acids extracted from fully
expanded leaves, roots and dry seeds. GC-MS analysis
was then performed as previously described to
characterize the different fatty acid species and
detect accumulation of hydroxy fatty acids in
transgenic tissues.

Expression of the Lesquerella hydroxylase in yeast

In order to demonstrate that the cloned L.
fendleri gene encoded a kappa hydroxylase, the gene
was expressed in yeast cells under transcriptional
control of an inducible promoter and the yeast cells
were examined for the presence of hydroxylated fatty
acids by GC-MS.

In a first step, a lambda genomic clone
containing the L. fendleri hydroxylase gene was cut
with EcoRI, and a resulting 1400 bp fragment
containing the coding sequence of the hydroxylase
gene was subcloned in the EcoRI site of the
pBluescript KS vector (Stratagene). This subclone,
pLesqcod, contains the coding region of the
Lesquerella hydroxylase plus some additional 3'
sequence.

In a second step, pLesqcod was cut with
HindIII and XbaI, and the insert fragment was cloned
into the corresponding sites of the yeast expression
vector pYES2 (Invitrogen; Figure 12). This subclone,
pLesqYes, contains the L. fendleri hydroxylase in
the sense orientation relative to the 3' side of the
GAL1 promoter. This promoter is inducible by the
addition of galactose to the growth medium, and is
repressed upon addition of glucose. In addition, the
vector carries origins of replication allowing the
propagation of pLesqYes in both yeast and E. coli.

Transformation of S. cerevisiae host strain CGY2557

Yeast strain CGY2557 (MATa, GAL+, ura3-52,
leu2-3, trp1, ade2-1, lys2-1, his5, can1-100) was
grown overnight at 28°C in YPD liquid medium (10 g
yeast extract, 20 g bacto-peptone, 20 g dextrose per
liter), and an aliquot of the culture was inoculated
into 100 ml fresh YPD medium and grown until the
OD600 of the culture was 1. Cells were then collected
by centrifugation and resuspended in about 200 µl of
supernatant. 40 µl aliquots of the cell suspension
were then mixed with 1-2 µg DNA and electroporated in
2 mm-gap cuvettes using a Biorad Gene Pulser
instrument set at 600 V, 200 Ω, 25 µF. 160 µl YPD was
added and the cells were plated on selective medium
containing glucose. Selective medium consisted of
6.7 g yeast nitrogen base (Difco), 0.4 g casamino
acids (Difco), 0.02 g adenine sulfate, 0.03 g
L-leucine, 0.02 g L-tryptophan, 0.03 g L-lysine-HCl,
0.03 g L-histidine-HCl, 2% glucose, water to 1
liter. Plates were solidified using 1.5% Difco
Bacto-agar. Transformant colonies appeared after 3
to 4 days incubation at 28°C.

Expression of the L. fendleri hydroxylase in yeast

Independent transformant colonies from the
previous experiment were used to inoculate 5 ml of
selective medium containing either 2% glucose (gene
repressed) or 2% galactose (gene induced) as the
sole carbon source. Independent colonies of CGY2557
transformed with pYES2 containing no insert were
used as controls.

After 2 days of growth at 28°C an aliquot of
the cultures was used to inoculate 5 ml of fresh
selective medium. The new culture was placed at 15°C
and grown for 9 days.

Fatty acid analysis of yeast expressing the L.
fendleri hydroxylase

Cells from 2.5 ml of culture were pelleted at
1800 g, and the supernatant was aspirated as
completely as possible. Pellets were then dispersed
in 1 ml of 1 N methanolic HCl (Supelco, Bellefonte,
PA). Transmethylation and derivatization of hydroxy
fatty acids were performed as described above. After
drying under nitrogen, samples were redissolved in
50 µl chloroform before being analyzed by GC-MS.
Samples were injected into an SP2330 fused-silica
capillary column (30 m x 0.25 mm ID, 0.25 µm film
thickness, Supelco). The temperature profile was
100-160°C at 25°C/min, 160-230°C at 10°C/min, 230°C
for 3 min, 230-100°C at 25°C/min. Flow rate was
0.9 ml/min. Fatty acids were analyzed using a
Hewlett-Packard 5971 series MS detector.

Gas chromatograms of derivatized fatty acid
methyl esters from induced cultures of yeast
containing pLesqYes contained a novel peak that
eluted at 7.6 min (Figure 13). O-TMS methyl
ricinoleate eluted at exactly the same position on
control chromatograms. This peak was not present in
cultures lacking pLesqYes or in cultures containing
pLesqYes grown on glucose (repressing conditions)
rather than galactose (inducing conditions). Mass
spectrometry of the peak (Figure 13) revealed that
the peak has the same spectrum as O-TMS methyl
ricinoleate. Thus, on the basis of chromatographic
retention time and mass spectrum, it was concluded
that the peak corresponded to O-TMS methyl
ricinoleate. The presence of ricinoleate in the
transgenic yeast cultures confirms the identity of
the gene as a kappa hydroxylase of this invention.

EXAMPLE 3. OBTAINING OTHER PLANT FATTY ACYL
HYDROXYLASES

The castor fah12 sequence could be used to
identify other kappa hydroxylases by methods such as
PCR and heterologous hybridization. However, because
of the high degree of sequence similarity between
Δ12 desaturases and kappa hydroxylases, the prior
art does not teach how to distinguish between the
two kinds of enzymes without a functional test such
as demonstrating activity in transgenic plants or
another suitable host (e.g., transgenic microbial or
animal cells). The identification of the L. fendleri
hydroxylase provided for the development of criteria
by which a hydroxylase and a desaturase may be
distinguished solely on the basis of deduced amino
acid sequence information.

Figures 9A-B show a sequence alignment of the
castor and L. fendleri hydroxylase sequences with
the castor hydroxylase sequence and all publicly
available sequences for all plant microsomal Δ12
fatty acid desaturases. Of the 384 amino acid
residues in the castor hydroxylase sequence, more
than 95% are identical to the corresponding residue
in at least one of the desaturase sequences.
Therefore, none of these residues are responsible
for the catalytic differences between the
hydroxylase and the desaturases. Of the remaining 16
residues in the castor hydroxylase and 14 residues
in the Lesquerella hydroxylase, all but seven
represent instances where the hydroxylase sequence
has a conservative substitution compared with one or
more of the desaturase sequences, or there is wide
variability in the amino acid at that position in
the various desaturases. By conservative, it is
meant that the following amino acids are
functionally equivalent: Ser/Thr, Ile/Leu/Val/Met,
Asp/Glu. Thus, these structural differences also
cannot account for the catalytic differences between
the desaturases and hydroxylases. This leaves just
seven amino acid residues where both the castor
hydroxylase and the Lesquerella hydroxylase differ
from all of the known desaturases and where all of
the known microsomal Δ12 desaturases have the
identical amino acid residue. These residues occur
at positions 69, 111, 155, 226, 304, 331 and 333 of
the alignment in Figure 9. Therefore, these seven
sites distinguish hydroxylases from desaturases.
Based on this analysis, the present inventors
believe that any enzyme with greater than 60%
sequence identity to one of the enzymes listed in
Figure 9 can be classified as a hydroxylase if it
differs from the sequence of the desaturases at
these seven positions. Because of slight differences
in the number of residues in a particular protein,
the numbering may vary from protein to protein but
the intent of the number system will be evident if
the protein in question is aligned with the castor
hydroxylase using the numbering system shown herein.
Thus, in conjunction with the methods for using the
Lesquerella hydroxylase gene to isolate homologous
genes, the structural criterion disclosed here
teaches how to isolate and identify plant kappa
hydroxylase genes for the purpose of genetically
modifying fatty acid composition as disclosed
herein.
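
The structural criterion described above can be illustrated with a
short Python sketch. It is not part of the original disclosure. It
assumes the sequences have already been aligned to equal length with
"-" as the gap character, it assumes percent identity is counted over
ungapped column pairs (the text does not define it), and all function
names are hypothetical. The toy sequences in the usage note are invented
and are not taken from Figure 9.

```python
from typing import List, Sequence

# 1-based alignment columns named in the text as diagnostic.
DIAGNOSTIC_COLUMNS = (69, 111, 155, 226, 304, 331, 333)

# Residue groups the text treats as functionally equivalent.
CONSERVATIVE_GROUPS = ("ST", "ILVM", "DE")


def percent_identity(a: str, b: str) -> float:
    """Percent identity over columns where neither sequence has a gap.

    Assumption: the text does not say how identity is computed.
    """
    paired = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not paired:
        return 0.0
    return 100.0 * sum(x == y for x, y in paired) / len(paired)


def is_conservative(a: str, b: str) -> bool:
    """True if a and b differ but fall in the same equivalence group."""
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def diagnostic_columns(hydroxylases: Sequence[str],
                       desaturases: Sequence[str]) -> List[int]:
    """Columns where all desaturases agree and every hydroxylase
    carries a different, non-conservative residue (1-based)."""
    found = []
    for col in range(len(desaturases[0])):
        desat = {seq[col] for seq in desaturases}
        if len(desat) != 1 or "-" in desat:
            continue  # variable or gapped among desaturases
        conserved = next(iter(desat))
        if all(h[col] not in (conserved, "-")
               and not is_conservative(h[col], conserved)
               for h in hydroxylases):
            found.append(col + 1)
    return found


def is_putative_hydroxylase(candidate: str,
                            references: Sequence[str],
                            desaturases: Sequence[str],
                            columns: Sequence[int] = DIAGNOSTIC_COLUMNS,
                            min_identity: float = 60.0) -> bool:
    """Apply the two-part test: >60% identity to a reference enzyme,
    and a residue unlike the desaturases at every diagnostic column."""
    if not any(percent_identity(candidate, ref) > min_identity
               for ref in references):
        return False
    for column in columns:
        idx = column - 1
        if candidate[idx] == "-":
            return False
        if any(candidate[idx] == d[idx] for d in desaturases):
            return False
    return True
```

With an invented eight-column alignment (desaturases `MASLKGDW` and
`MASLRGDW`, hydroxylases `MGTLKGNW` and `MGSLKGHW`),
`diagnostic_columns` picks out columns 2 and 7, and a candidate such as
`MGSLKGQW` passes `is_putative_hydroxylase` when those two columns are
supplied in place of the seven positions of Figure 9.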



WO 97/30582                    PCT/US97/02187



EXAMPLE 4. USING HYDROXYLASES TO ALTER THE LEVEL OF
FATTY ACID UNSATURATION

Evidence that kappa hydroxylases of this
invention can be used to alter the level of fatty
acid unsaturation was obtained from the analysis of
transgenic plants that expressed the castor
hydroxylase under control of the cauliflower mosaic
virus promoter. The construction of the plasmids and
the production of transgenic Arabidopsis plants were
described in Example 1 (above). The fatty acid
composition of seed lipids from wild type and six
transgenic lines (1-2/a, 1-2/b, 1-3/b, 4F, 7E, 7F)
is shown in Table 2.

Table 2. Fatty acid composition of lipids from
Arabidopsis seeds. The asterisk (*) indicates that
for some of these samples, the 18:3 and 20:1 peaks
overlapped on the gas chromatograph and, therefore,
the total amount of these two fatty acids is
reported.

The results in Table 2 show that expression
of the castor hydroxylase in transgenic Arabidopsis
plants caused a substantial increase in the amount
of oleic acid (18:1) in the seed lipids and an
approximately corresponding decrease in the amount
of linoleic acid (18:2). The average amount of oleic
acid in the six transgenic lines was 29.9% versus
14.7% in the wild type.

The precise mechanism by which expression of
the castor hydroxylase gene causes increased
accumulation of oleic acid is not known. However, an
understanding of the mechanism is not required in
order to exploit this invention for the directed
alteration of plant lipid fatty acid composition.
Furthermore, it will be recognized by one skilled in
the art that many improvements of this invention may
be envisioned. Of particular interest will be the
use of other promoters which have high levels of
seed-specific expression.

Since hydroxylated fatty acids were not
detected in the seed lipids of transgenic line 1-2b,
it seems likely that it is not the presence of
hydroxylated fatty acids per se that causes the
effect of the castor hydroxylase gene on desaturase
activity. Protein-protein interaction between the
hydroxylase and the Δ12-oleate desaturase or another
protein may be required for the overall reaction
(e.g., cytochrome b5) or for the regulation of
desaturase activity. For example, interaction
between the hydroxylase and this other protein may
suppress the activity of the desaturase. In
particular, the quaternary structure of the
membrane-bound desaturases has not been established.
It is possible that these enzymes are active as
dimers or as multimeric complexes containing more
than two subunits. Thus, if dimers or multimers form
between the desaturase and the hydroxylase, the
presence of the hydroxylase in the complex may
disrupt the activity of the desaturase.

Transgenic plants may be produced in which
the hydroxylase enzyme has been rendered inactive by
the elimination of one or more of the histidine
residues that have been proposed to bind iron
molecules required for catalysis. Several of these
histidine residues have been shown to be essential
for desaturase activity by site directed mutagenesis
(Shanklin et al., 1994). Codons encoding histidine
residues in the castor hydroxylase gene will be
changed to alanine codons as described by Shanklin
et al. (1994). The modified genes will be introduced
into transgenic plants of Arabidopsis, and possibly
other species such as tobacco, by the methods
described in Example 1 of this application.
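
The histidine-to-alanine codon changes described above can be planned
in silico before any mutagenesis is carried out. The following Python
sketch is illustrative only and is not part of the original disclosure.
The function name, the choice of GCT as the alanine codon (any of GCT,
GCC, GCA or GCG would encode alanine) and the short test fragment are
assumptions. The fragment is invented and is not taken from the castor
hydroxylase gene.

```python
from typing import Iterable

HIS_CODONS = {"CAT", "CAC"}   # standard genetic code
ALA_CODON = "GCT"             # assumption: one of four alanine codons


def his_to_ala(cds: str, residue_numbers: Iterable[int]) -> str:
    """Return a copy of an in-frame coding sequence in which the
    codons at the given 1-based residue numbers are changed from
    histidine to alanine.

    Raises ValueError if the sequence is not a whole number of codons
    or if a requested residue is not encoded by a histidine codon.
    """
    seq = cds.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    for number in residue_numbers:
        codon = codons[number - 1]
        if codon not in HIS_CODONS:
            raise ValueError(
                f"residue {number} is encoded by {codon}, not histidine")
        codons[number - 1] = ALA_CODON
    return "".join(codons)
```

For the invented fragment `ATGCATGAATGTGGTCACCAC` (Met His Glu Cys Gly
His His), calling `his_to_ala(fragment, [2, 6])` replaces the second and
sixth codons and leaves the seventh histidine codon unchanged. Checking
that each targeted codon really is CAT or CAC guards against numbering
errors when residue positions are taken from an alignment rather than
from the gene itself.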

In order to examine the effect on all
tissues, the strong constitutive cauliflower mosaic
virus promoter may be used to cause transcription of
the modified genes. However, it will be recognized
that in order to specifically examine the effect of
expression of the mutant gene on seed lipids, a
seed-specific promoter such as the B. napus napin
promoter may be used. An expected outcome is that
expression of the inactive hydroxylase protein in
transgenic plants will inhibit the activity of the
endoplasmic reticulum-localized Δ12-desaturase.
Maximum inhibition will be obtained by expressing
high levels of the mutant protein.

In a further embodiment of this invention,
mutations that inactivate other hydroxylases, such
as the Lesquerella hydroxylase of this invention,
may also be useful for decreasing the amount of
endoplasmic reticulum-localized Δ12-desaturase
activity in the same way as the castor gene. In a
further embodiment of this invention, similar
mutations of desaturase genes may also be used to
inactivate endogenous desaturases. Thus, expression
of a catalytically inactive fad2 gene from Arabidopsis
in transgenic Arabidopsis may inhibit the activity
of the endogenous fad2 gene product.

Similarly, expression of the catalytically
inactive forms of Δ12-desaturase from Arabidopsis or
other plants in transgenic soybean, rapeseed,
Crambe, Brassica juncea, canola, flax, sunflower,
safflower, cotton, cuphea, peanut, coconut,
oil palm or corn may lead to inactivation of
endogenous Δ12-desaturase activity in these plants.
In a further embodiment of this invention,
expression of catalytically inactive forms of other
desaturases such as the Δ15-desaturases may lead to
inactivation of the corresponding desaturases.

An example of a class of mutants useful in
the present invention is the "dominant negative"
mutants that block the function of a gene at the
protein level (Herskowitz, 1987). A cloned gene is
altered so that it encodes a mutant product capable
of inhibiting the wild type gene product in a cell,
thus causing the cell to be deficient in the
function of that gene product. Inhibitory variants
of a wild type product can be designed because
proteins have multiple functional domains that can
be mutated independently, e.g., oligomerization,
substrate binding, catalysis, membrane association
domains or the like. In general, dominant negative
proteins retain an intact, functional subset of the
domains of the parent, wild type protein, but have
the complement of that subset either missing or
altered so as to be nonfunctional.
Whatever the precise basis for the inhibitory
effect of the castor hydroxylase on desaturation,
because the castor hydroxylase has very low
nucleotide sequence homology (i.e., about 67%) to
the Arabidopsis fad2 gene (encoding the endoplasmic
reticulum-localized Δ12-desaturase), the inhibitory
effect of this gene, which is provisionally called
"protein-mediated inhibition" ("protibition"), may
have broad utility because it does not depend on a
high degree of nucleotide sequence homology between
the transgene and the endogenous target gene. In
particular, the castor hydroxylase may be used to
inhibit the endoplasmic reticulum-localized
Δ12-desaturase activity of all higher plants. Of
particular relevance are those species used for oil
production. These include but are not limited to
rapeseed, Crambe, Brassica juncea, canola, flax,
sunflower, safflower, cotton, cuphea, soybean,
peanut, coconut, oil palm and corn.

CONCLUDING REMARKS 

By the above examples, demonstration of
critical factors in the production of novel
hydroxylated fatty acids by expression of a kappa
hydroxylase gene from castor in transgenic plants is
described. In addition, a complete cDNA sequence of
the Lesquerella fendleri kappa hydroxylase is also
provided. A full sequence of the castor hydroxylase
is also given with various constructs for use in
host cells. Through this invention, one can obtain
the amino acid and nucleic acid sequences which
encode plant fatty acyl hydroxylases from a variety
of sources and for a variety of applications. Also
revealed is a novel method by which the level of
fatty acid desaturation can be altered in a directed
way through the use of genetically altered
hydroxylase or desaturase genes.

All publications mentioned in this
specification are indicative of the level of skill
of those skilled in the art to which this invention
pertains. All publications are herein incorporated
by reference to the same extent as if each
individual publication was specifically and
individually indicated to be incorporated by
reference.

Although the foregoing invention has been
described in some detail by way of illustration and
example for purposes of clarity of understanding, it
will be obvious that certain changes and
modifications may be practiced within the scope of
the appended claims.

SEQUENCE LISTING
(1) GENERAL INFORMATION:

(i) APPLICANT: Somerville, Chris
               Broun, Pierre
               van de Loo, Frank
               Boddupalli, Sekhar S.

(ii) TITLE OF INVENTION: Production of Hydroxylated
Fatty Acids in Genetically Modified Plants

(iii) NUMBER OF SEQUENCES: 15

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: PILLSBURY MADISON & SUTRO

(B) STREET: 1100 NEW YORK AVENUE, N.W.

(C) CITY: WASHINGTON

(D) STATE: D.C.

(E) COUNTRY: USA

(F) ZIP: 20005-3918

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: 3.5 inch, 1.44 MB storage

(B) COMPUTER: IBM compatible

(C) OPERATING SYSTEM: DOS 5.0

(D) SOFTWARE: Word Perfect 5.1

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: not yet assigned

(B) FILING DATE: February 6, 1997

(C) CLASSIFICATION:



(2) INFORMATION FOR SEQ ID NO: 1

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 543 nucleotides

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

TATTGGCACC GGCGGCACCA TTCCAACAAT GGATCCCTAG 40
AAAAAGATGA AGTCTTTGTC CCACCTAAGA AAGCTGCAGT 80
CANATGGTAT GTCAAATACC TCAACAACCC TCTTGGACGC 120
ATTCTGGTGT TAACAGTTCA GTTTATCCTC GGGTGGCCTT 160
TGTATCTAGC CTTTAATGTA TCAGGTAGAC CTTATGATGG 200
TTTCGCTTCA CATTTCTTCC CTCATGCACC TATCTTTAAG 240
GACCGTGAAC GTCTCCAGAT ATACATCTCA GATGCTGGTA 280
TTCTAGCTGT CTGTTATGGT CTTTACCGTT ACGCTGCTTC 320
ACAAGGATTG ACTGCTATGA TCTGCGTCTA CGGAGTACCG 360
CTTTTGATAG TGAACTTTTT CCTTGTCTTG GTCACTTTCT 400
TGCAGCACAC TCATCCTTCA TTACCTCACT ATGATTCAAC 440
CGAGTGGGAA TGGATTAGAG GAGCTTTGGT TACGGTAGAC 480
AGAGACTATG GAATCTTGAA CAAGGTGTTT CACAACATAA 520
CAGACACCCA CGTAGCACAC CAC 543



(2) INFORMATION FOR SEQ ID NO: 2

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 544 nucleotides

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

TATAGGCACC GGAGGCACCA TTCCAACACA GGATCCCTCG 40
AAAGAGATGA AGTATTTGTC CCAAAGCAGA AATCCGCAAT 80
CAAGTGGTAC GGCGAATACC TCAACAACCC TCCTGGTCGC 120
ATCATGATGT TAACTGTCCA GTTCGTCCTC GGATGGCCCT 160
TGTACTTAGC CTTCAACGTT TCTGGCAGAC CCTACAATGG 200
TTTCGCTTCC CATTTCTTCC CCAATGCTCC TATCTACAAC 240
GACCGTGAAC GCCTCCAGAT TTACATCTCT GATGCTGGTA 280
TTCTAGCCGT CTGTTATGGT CTTTACCGTT ACGCTGTTGC 320
ACAAGGACTA GCCTCAATGA TCTGTCTAAA CGGAGTTCCG 360
CTTCTGATAG TTAACTTTTT CCTCGTCTTG ATCACTTACT 400
TACAACACAC TCACCCTGCG TTGCCTCACT ATGATTCATC 440
AGAGTGGGAT TGGCTTAGAG GAGCTTTAGC TACTGTAGAC 480
AGAGACTATG GAATCTTGAA CAAGGTGTTC CATAACATCA 520
CAGACACCCA CGTCGCACAC CACT 544



(2) INFORMATION FOR SEQ ID NO: 3

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1855 nucleotides

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

ATGAAGCTTT ATAAGAAGTT AGTTTTCTCT GGTGACAGAG 40
AAATTNTGTC AATTGGTAGT GACAGTTGAA GCAACAGGAA 80
CAACAAGGAT GGTTGGTGNT GATGCTGATG TGGTGATGTG 120
TTATTCATCA AATACTAAAT ACTACATTAC TTGTTGCTGC 160
CTACTTCTCC TATTTCCTCC GCCACCCATT TTGGACCCAC 200
GANCCTTCCA TTTAAACCCT CTCTCGTGCT ATTCACCAGA 240
AGAGAAGCCA AGAGAGAGAG AGAGAGAATG TTCTGAGGAT 280
CATTGTCTTC TTCATCGTTA TTAACGTAAG TTTTTTTTGA 320
CCACTCATAT CTAAAATCTA GTACATGCAA TAGATTAATG 360
ACTGTTCCTT CTTTTGATAT TTTCAGCTTC TTGAATTCAA 400
GATGGGTGCT GGTGGAAGAA TAATGGTTAC CCCCTCTTCC 440
AAGAAATCAG AAACTGAAGC CCTAAAACGT GGACCATGTG 480
AGAAACCACC ATTCACTGTT AAAGATCTGA AGAAAGCAAT 520
CCCACAGCAT TGTTTCAAGC GCTCTATCCC TCGTTCTTTC 560
TCCTACCTTC TCACAGATAT CACTTTAGTT TCTTGCTTCT 600
ACTACGTTGC CACAAATTAC TTCTCTCTTC TTCCTCAGCC 640
TCTCTCTACT TACCTAGCTT GGCCTCTCTA TTGGGTATGT 680
CAAGGCTGTG TCTTAACCGG TATCTGGGTC ATTGGCCATG 720
AATGTGGTCA CCATGCATTC AGTGACTATC AATGGGTAGA 760
TGACACTGTT GGTTTTATCT TCCATTCCTT CCTTCTCGTC 800
CCTTACTTCT CCTGGAAATA CAGTCATCGT CGTCACCATT 840
CCAACAATGG ATCTCTCGAG AAAGATGAAG TCTTTGTCCC 880
ACCGAAGAAA GCTGCAGTCA AATGGTATGT TAAATACCTC 920
AACAACCCTC TTGGACGCAT TCTGGTGTTA ACAGTTCAGT 960
TTATCCTCGG GTGGCCTTTG TATCTAGCCT TTAATGTATC 1000
AGGTAGACCT TATGATGGTT TCGCTTCACA TTTCTTCCCT 1040
CATGCACCTA TCTTTAAAGA CCGAGAACGC CTCCAGATAT 1080
ACATCTCAGA TGCTGGTATT CTAGCTGTCT GTTATGGTCT 1120
TTACCGTTAC GCTGCTTCAC AAGGATTGAC TGCTATGATC 1160
TGCGTCTATG GAGTACCGCT TTTGATAGTG AACTTTTTCC 1200
TTGTCTTGGT AACTTTCTTG CAGCACACTC ATCCTTCGTT 1240
ACCTCATTAT GATTCAACCG AGTGGGAATG GATTAGAGGA 1280
GCTTTGGTTA CGGTAGACAG AGACTATGGA ATATTGAACA 1320
AGGTGTTCCA TAACATAACA GACACACATG TGGCTCATCA 1360
TCTCTTTGCA ACTATACCGC ATTATAACGC AATGGAAGCT 1400
ACAGAGGCGA TAAAGCCAAT ACTTGGTGAT TACTACCACT 1440
TCGATGGAAC ACCGTGGTAT GTGGCCATGT ATAGGGAAGC 1480
AAAGGAGTGT CTCTATGTAG AACCGGATAC GGAACGTGGG 1520
AAGAAAGGTG TCTACTATTA CAACAATAAG TTATGAGGCT 1560
GATAGGGCGA GAGAAGTGCA ATTATCAATC TTCATTTCCA 1600
TGTTTTAGGT GTCTTGTTTA AGAAGCTATG CTTTGTTTCA 1640
ATAATCTCAG AGTCCATNTA GTTGTGTTCT GGTGCATTTT 1680
GCCTAGTTAT GTGGTGTCGG AAGTTAGTGT TCAAACTGCT 1720
TCCTGCTGTG CTGCCCAGTG AAGAACAAGT TTACGTGTTT 1760
AAAATACTCG GAACGAATTG ACCACAANAT ATCCAAAACC 1800
GGCTATCCGA ATTCCATATC CGAAAACCGG ATATCCAAAT 1840
TTCCAGAGTA CTTAG 1855



(2) INFORMATION FOR SEQ ID NO: 4

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 384 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Gly Ala Gly Gly Arg Ile Met Val Thr
                  5                  10
Pro Ser Ser Lys Lys Ser Glu Thr Glu Ala
                 15                  20
Leu Lys Arg Gly Pro Cys Glu Lys Pro Pro
                 25                  30
Phe Thr Val Lys Asp Leu Lys Lys Ala Ile
                 35                  40
Pro Gln His Cys Phe Lys Arg Ser Ile Pro
                 45                  50
Arg Ser Phe Ser Tyr Leu Leu Thr Asp Ile
                 55                  60
Thr Leu Val Ser Cys Phe Tyr Tyr Val Ala
                 65                  70
Thr Asn Tyr Phe Ser Leu Leu Pro Gln Pro
                 75                  80
Leu Ser Thr Tyr Leu Ala Trp Pro Leu Tyr
                 85                  90
Trp Val Cys Gln Gly Cys Val Leu Thr Gly
                 95                 100
Ile Trp Val Ile Gly His Glu Cys Gly His
                105                 110
His Ala Phe Ser Asp Tyr Gln Trp Val Asp
                115                 120
Asp Thr Val Gly Phe Ile Phe His Ser Phe
                125                 130
Leu Leu Val Pro Tyr Phe Ser Trp Lys Tyr
                135                 140
Ser His Arg Arg His His Ser Asn Asn Gly
                145                 150
Ser Leu Glu Lys Asp Glu Val Phe Val Pro
                155                 160
Pro Lys Lys Ala Ala Val Lys Trp Tyr Val
                165                 170
Lys Tyr Leu Asn Asn Pro Leu Gly Arg Ile
                175                 180
Leu Val Leu Thr Val Gln Phe Ile Leu Gly
                185                 190
Trp Pro Leu Tyr Leu Ala Phe Asn Val Ser
                195                 200
Gly Arg Pro Tyr Asp Gly Phe Ala Ser His
                205                 210
Phe Phe Pro His Ala Pro Ile Phe Lys Asp
                215                 220
Arg Glu Arg Leu Gln Ile Tyr Ile Ser Asp
                225                 230
Ala Gly Ile Leu Ala Val Cys Tyr Gly Leu
                235                 240
Tyr Arg Tyr Ala Ala Ser Gln Gly Leu Thr
                245                 250
Ala Met Ile Cys Val Tyr Gly Val Pro Leu
                255                 260
Leu Ile Val Asn Phe Phe Leu Val Leu Val
                265                 270
Thr Phe Leu Gln His Thr His Pro Ser Leu
                275                 280
Pro His Tyr Asp Ser Thr Glu Trp Glu Trp
                285                 290
Ile Arg Gly Ala Leu Val Thr Val Asp Arg
                295                 300
Asp Tyr Gly Ile Leu Asn Lys Val Phe His
                305                 310
Asn Ile Thr Asp Thr His Val Ala His His
                315                 320
Leu Phe Ala Thr Ile Pro His Tyr Asn Ala
                325                 330
Met Glu Ala Thr Glu Ala Ile Lys Pro Ile
                335                 340
Leu Gly Asp Tyr Tyr His Phe Asp Gly Thr
                345                 350
Pro Trp Tyr Val Ala Met Tyr Arg Glu Ala
                355                 360
Lys Glu Cys Leu Tyr Val Glu Pro Asp Thr
                365                 370
Glu Arg Gly Lys Lys Gly Val Tyr Tyr Tyr
                375                 380
Asn Asn Lys Leu



(2) INFORMATION FOR SEQ ID NO: 5

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 387 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Met Gly Gly Gly Gly Arg Met Ser Thr Val
                  5                  10
Ile Thr Ser Asn Asn Ser Glu Lys Lys Gly
                 15                  20
Gly Ser Ser His Leu Lys Arg Ala Pro His
                 25                  30
Thr Lys Pro Pro Phe Thr Leu Gly Asp Leu
                 35                  40
Lys Arg Ala Ile Pro Pro His Cys Phe Glu
                 45                  50
Arg Ser Phe Val Arg Ser Phe Ser Tyr Val
                 55                  60
Ala Tyr Asp Val Cys Leu Ser Phe Leu Phe
                 65                  70
Tyr Ser Ile Ala Thr Asn Phe Phe Pro Tyr
                 75                  80
Ile Ser Ser Pro Leu Ser Tyr Val Ala Trp
                 85                  90
Leu Val Tyr Trp Leu Phe Gln Gly Cys Ile
                 95                 100
Leu Thr Gly Leu Trp Val Ile Gly His Glu
                105                 110
Cys Gly His His Ala Phe Ser Glu Tyr Gln
                115                 120
Leu Ala Asp Asp Ile Val Gly Leu Ile Val
                125                 130
His Ser Ala Leu Leu Val Pro Tyr Phe Ser
                135                 140
Trp Lys Tyr Ser His Arg Arg His His Ser
                145                 150
Asn Ile Gly Ser Leu Glu Arg Asp Glu Val
                155                 160
Phe Val Pro Lys Ser Lys Ser Lys Ile Ser
                165                 170
Trp Tyr Ser Lys Tyr Ser Asn Asn Pro Pro
                175                 180
Gly Arg Val Leu Thr Leu Ala Ala Thr Leu
                185                 190
Leu Leu Gly Trp Pro Leu Tyr Leu Ala Phe
                195                 200
Asn Val Ser Gly Arg Pro Tyr Asp Arg Phe
                205                 210
Ala Cys His Tyr Asp Pro Tyr Gly Pro Ile
                215                 220
Phe Ser Glu Arg Glu Arg Leu Gln Ile Tyr
                225                 230
Ile Ala Asp Leu Gly Ile Phe Ala Thr Thr
                235                 240
Phe Val Leu Tyr Gln Ala Thr Met Ala Lys
                245                 250
Gly Leu Ala Trp Val Met Arg Ile Tyr Gly
                255                 260
Val Pro Leu Leu Ile Val Asn Cys Phe Leu
                265                 270
Val Met Ile Thr Tyr Leu Gln His Thr His
                275                 280
Pro Ala Ile Pro Arg Tyr Gly Ser Ser Glu
                285                 290
Trp Asp Trp Leu Arg Gly Ala Met Val Thr
                295                 300
Val Asp Arg Asp Tyr Gly Val Leu Asn Lys
                305                 310
Val Phe His Asn Ile Ala Asp Thr His Val
                315                 320
Ala His His Leu Phe Ala Thr Val Pro His
                325                 330
Tyr His Ala Met Glu Ala Thr Lys Ala Ile
                335                 340
Lys Pro Ile Met Gly Glu Tyr Tyr Arg Tyr
                345                 350
Asp Gly Thr Pro Phe Tyr Lys Ala Leu Trp
                355                 360
Arg Glu Ala Lys Glu Cys Leu Phe Val Glu
                365                 370
Pro Asp Glu Gly Ala Pro Thr Gln Gly Val
                375                 380
Phe Trp Tyr Arg Asn Lys Tyr
                385

(2) INFORMATION FOR SEQ ID NO: 6

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 383 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Gly Ala Gly Gly Arg Met Pro Val Pro
                  5                  10
Thr Ser Ser Lys Lys Ser Glu Thr Asp Thr
                 15                  20
Thr Lys Arg Val Pro Cys Glu Lys Pro Pro
                 25                  30
Phe Ser Val Gly Asp Leu Lys Lys Ala Ile
                 35                  40
Pro Pro His Cys Phe Lys Arg Ser Ile Pro
                 45                  50
Arg Ser Phe Ser Tyr Leu Ile Ser Asp Ile
                 55                  60
Ile Ile Ala Ser Cys Phe Tyr Tyr Val Ala
                 65                  70
Thr Asn Tyr Phe Ser Leu Leu Pro Gln Pro
                 75                  80
Leu Ser Tyr Leu Ala Trp Pro Leu Tyr Trp
                 85                  90
Ala Cys Gln Gly Cys Val Leu Thr Gly Ile
                 95                 100
Trp Val Ile Ala His Glu Cys Gly His His
                105                 110
Ala Phe Ser Asp Tyr Gln Trp Leu Asp Asp
                115                 120
Thr Val Gly Leu Ile Phe His Ser Phe Leu
                125                 130
Leu Val Pro Tyr Phe Ser Trp Lys Tyr Ser
                135                 140
His Arg Arg His His Ser Asn Thr Gly Ser
                145                 150
Leu Glu Arg Asp Glu Val Phe Val Pro Lys
                155                 160
Gln Lys Ser Ala Ile Lys Trp Tyr Gly Lys
                165                 170
Tyr Leu Asn Asn Pro Leu Gly Arg Ile Met
                175                 180
Met Leu Thr Val Gln Phe Val Leu Gly Trp
                185                 190
Pro Leu Tyr Leu Ala Phe Asn Val Ser Gly
                195                 200
Arg Pro Tyr Asp Gly Phe Ala Cys His Phe
                205                 210
Phe Pro Asn Ala Pro Ile Tyr Asn Asp Arg
                215                 220
Glu Arg Leu Gln Ile Tyr Leu Ser Asp Ala
                225                 230
Gly Ile Leu Ala Val Cys Phe Gly Leu Tyr
                235                 240
Arg Tyr Ala Ala Ala Gln Gly Met Ala Ser
                245                 250
Met Ile Cys Leu Tyr Gly Val Pro Leu Leu
                255                 260
Ile Val Asn Ala Phe Leu Val Leu Ile Thr
                265                 270
Tyr Leu Gln His Thr His Pro Ser Leu Pro
                275                 280
His Tyr Asp Ser Ser Glu Trp Asp Trp Leu
                285                 290
Arg Gly Ala Leu Ala Thr Val Asp Arg Asp
                295                 300
Tyr Gly Ile Leu Asn Lys Val Phe His Asn
                305                 310
Ile Thr Asp Thr His Val Ala His His Leu
                315                 320
Phe Ser Thr Met Pro His Tyr Asn Ala Met
                325                 330
Glu Ala Thr Lys Ala Ile Lys Pro Ile Leu
                335                 340
Gly Asp Tyr Tyr Gln Phe Asp Gly Thr Pro
                345                 350
Trp Tyr Val Ala Met Tyr Arg Glu Ala Lys
                355                 360
Glu Cys Ile Tyr Val Glu Pro Asp Arg Glu
                365                 370
Gly Asp Lys Lys Gly Val Tyr Trp Tyr Asn
                375                 380
Asn Lys Leu



(2) INFORMATION FOR SEQ ID NO: 7

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 384 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met Gly Ala Gly Gly Arg Met Gln Val Ser
                  5                  10
Pro Pro Ser Lys Lys Ser Glu Thr Asp Asn
                 15                  20
Ile Lys Arg Val Pro Cys Glu Thr Pro Pro
                 25                  30
Phe Thr Val Gly Glu Leu Lys Lys Ala Ile
                 35                  40
Pro Pro His Cys Phe Lys Arg Ser Ile Pro
                 45                  50
Arg Ser Phe Ser His Leu Ile Trp Asp Ile
                 55                  60
Ile Ile Ala Ser Cys Phe Tyr Tyr Val Ala
                 65                  70
Thr Thr Tyr Phe Pro Leu Leu Pro Asn Pro
                 75                  80
Leu Ser Tyr Phe Ala Trp Pro Leu Tyr Trp
                 85                  90
Ala Cys Gln Gly Cys Val Leu Thr Gly Val
                 95                 100
Trp Val Ile Ala His Glu Cys Gly His Ala
                105                 110
Ala Phe Ser Asp Tyr Gln Trp Leu Asp Asp
                115                 120
Thr Val Gly Leu Ile Phe His Ser Phe Leu
                125                 130
Leu Val Pro Tyr Phe Ser Trp Lys Tyr Ser
                135                 140
His Arg Arg His His Ser Asn Thr Gly Ser
                145                 150
Leu Glu Arg Asp Glu Val Phe Val Pro Arg
                155                 160
Arg Ser Gln Thr Ser Ser Gly Thr Ala Ser
                165                 170
Thr Ser Thr Thr Phe Gly Arg Thr Val Met
                175                 180
Leu Thr Val Gln Phe Thr Leu Gly Trp Pro
                185                 190
Leu Tyr Leu Ala Phe Asn Val Ser Gly Arg
                195                 200
Pro Tyr Asp Gly Gly Phe Ala Cys His Phe
                205                 210
His Pro Asn Ala Pro Ile Tyr Asn Asp Arg
                215                 220
Glu Arg Leu Gln Ile Tyr Ile Ser Asp Ala
                225                 230
Gly Ile Leu Ala Val Cys Tyr Gly Leu Leu
                235                 240
Pro Tyr Ala Ala Val Gln Gly Val Ala Ser
                245                 250
Met Val Cys Phe Leu Arg Val Pro Leu Leu
                255                 260
Ile Val Asn Gly Phe Leu Val Leu Ile Thr
                265                 270
Tyr Leu Gln His Thr His Pro Ser Leu Pro
                275                 280
His Tyr Asp Ser Ser Glu Trp Asp Trp Leu
                285                 290
Arg Gly Ala Leu Ala Thr Val Asp Arg Asp
                295                 300
Tyr Gly Ile Leu Asn Gln Gly Phe His Asn
                305                 310
Ile Thr Asp Thr His Glu Ala His His Leu
                315                 320
Phe Ser Thr Met Pro His Tyr His Ala Met
                325                 330
Glu Ala Thr Lys Ala Ile Lys Pro Ile Leu
                335                 340
Gly Glu Tyr Tyr Gln Phe Asp Gly Thr Pro
                345                 350
Val Val Lys Ala Met Trp Arg Glu Ala Lys
                355                 360
Glu Cys Ile Tyr Val Glu Pro Asp Arg Gln
                365                 370
Gly Glu Lys Lys Gly Val Phe Trp Tyr Asn
                375                 380
Asn Lys Leu Xaa



(2) INFORMATION FOR SEQ ID NO: 8

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 309 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Ser Leu Leu Thr Ser Phe Ser Tyr Val Val
                  5                  10
Tyr Asp Leu Ser Phe Ala Phe Ile Phe Tyr
                 15                  20
Ile Ala Thr Thr Tyr Phe His Leu Leu Pro
                 25                  30
Gln Pro Phe Ser Leu Ile Ala Trp Pro Ile
                 35                  40
Tyr Trp Val Leu Gln Gly Cys Leu Leu Thr
                 45                  50
Arg Val Cys Gly His His Ala Phe Ser Lys
                 55                  60
Tyr Gln Trp Val Asp Asp Val Val Gly Leu
                 65                  70
Thr Leu His Ser Thr Leu Leu Val Pro Tyr
                 75                  80
Phe Ser Trp Lys Ile Ser His Arg Arg His
                 85                  90
His Ser Asn Thr Gly Ser Leu Asp Arg Asp
                 95                 100
Glu Arg Val Lys Val Ala Trp Phe Ser Lys
                105                 110
Tyr Leu Asn Asn Pro Leu Gly Arg Ala Val
                115                 120
Ser Leu Leu Val Thr Leu Thr Ile Gly Trp
                125                 130
Pro Met Tyr Leu Ala Phe Asn Val Ser Gly
                135                 140
Arg Pro Tyr Asp Ser Phe Ala Ser His Tyr
                145                 150
His Pro Tyr Arg Val Arg Leu Leu Ile Tyr
                155                 160
Val Ser Asp Val Ala Leu Phe Ser Val Thr
                165                 170
Tyr Ser Leu Tyr Arg Val Ala Thr Leu Lys
                175                 180
Gly Leu Val Trp Leu Leu Cys Val Tyr Gly
                185                 190
Val Pro Leu Leu Ile Val Asn Gly Phe Leu
                195                 200
Val Thr Ile Thr Tyr Leu Arg Val His Tyr
                205                 210
Asp Ser Ser Glu Trp Asp Trp Leu Lys Gly
                215                 220
Ala Leu Ala Thr Met Asp Arg Asp Tyr Gly
                225                 230
Ile Leu Asn Lys Val Phe His His Ile Thr
                235                 240
Asp Thr His Val Ala His His Leu Phe Ser
                245                 250
Thr Met Pro His Tyr His Leu Arg Val Lys
                255                 260
Pro Ile Leu Gly Glu Tyr Tyr Gln Phe Asp
                265                 270
Asp Thr Pro Phe Tyr Lys Ala Leu Trp Arg
                275                 280
Glu Ala Arg Glu Cys Leu Tyr Val Glu Pro
                285                 290
Asp Glu Gly Thr Ser Glu Lys Gly Val Tyr
                295                 300
Trp Tyr Arg Asn Lys Tyr Leu Arg Val
                305

(2) INFORMATION FOR SEQ ID NO: 9

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 302 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Phe Ser Tyr Val Val Tyr Asp Leu Thr Ile
                  5                  10
Ala Phe Cys Leu Tyr Tyr Val Ala Thr His
                 15                  20
Tyr Phe His Leu Leu Pro Gly Pro Leu Ser
                 25                  30
Phe Arg Gly Met Ala Ile Tyr Trp Ala Val
                 35                  40
Gln Gly Cys Ile Leu Thr Gly Val Trp Val
                 45                  50
Val Ala Phe Ser Asp Tyr Gln Leu Leu Asp
                 55                  60
Asp Ile Val Gly Leu Ile Leu His Ser Ala
                 65                  70
Leu Leu Val Pro Tyr Phe Ser Trp Lys Tyr
                 75                  80
Ser His Arg Arg His His Ser Asn Thr Gly
                 85                  90
Ser Leu Glu Arg Asp Glu Val Phe Val Pro
                 95                 100
Lys Val Ser Lys Tyr Leu Asn Asn Pro Pro
                105                 110
Gly Arg Val Leu Thr Leu Ala Val Thr Leu
                115                 120
Thr Leu Gly Trp Pro Leu Tyr Leu Ala Leu
                125                 130
Asn Val Ser Gly Arg Pro Tyr Asp Arg Phe
                135                 140
Ala Cys His Tyr Asp Pro Tyr Gly Pro Ile
                145                 150
Tyr Ser Val Ile Ser Asp Ala Gly Val Leu
                155                 160
Ala Val Val Tyr Gly Leu Phe Arg Leu Ala
                165                 170
Met Ala Lys Gly Leu Ala Trp Val Val Cys
                175                 180
Val Tyr Gly Val Pro Leu Leu Val Val Asn
                185                 190
Gly Phe Leu Val Leu Ile Thr Phe Leu Gln
                195                 200
His Thr His Val Ser Glu Trp Asp Trp Leu
                205                 210
Arg Gly Ala Leu Ala Thr Val Asp Arg Asp
                215                 220
Tyr Gly Ile Leu Asn Lys Val Phe His Asn
                225                 230
Ile Thr Asp Thr His Val Ala His His Leu
                235                 240
Phe Ser Thr Met Pro His Tyr His Ala Met
                245                 250
Glu Ala Thr Val Glu Tyr Tyr Arg Phe Asp
                255                 260
Glu Thr Pro Phe Val Lys Ala Met Trp Arg
                265                 270
Glu Ala Arg Glu Cys Ile Tyr Val Glu Pro
                275                 280
Asp Gln Ser Thr Glu Ser Lys Gly Val Phe
                285                 290
Trp Tyr Asn Asn Lys Leu Ala Met Glu Ala
                295                 300
Thr Val



(2) INFORMATION FOR SEQ ID NO: 10

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 372 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Gly Ala Gly Gly Arg Met Thr Glu Lys
                  5                  10
Glu Arg Glu Lys Gln Glu Gln Leu Ala Arg
                 15                  20
Ala Thr Gly Gly Ala Ala Met Gln Arg Ser
                 25                  30
Pro Val Glu Lys Pro Pro Phe Thr Leu Gly
                 35                  40
Gln Ile Lys Lys Ala Ile Pro Pro His Cys
                 45                  50
Phe Glu Arg Ser Val Leu Lys Ser Phe Ser
                 55                  60
Tyr Val Val His Asp Leu Val Ile Ala Ala
                 65                  70
Ala Leu Leu Tyr Phe Ala Leu Ala Ile Ile
                 75                  80
Pro Ala Leu Pro Ser Pro Leu Arg Tyr Ala
                 85                  90
Ala Trp Pro Leu Tyr Trp Ile Ala Gln Gly
                 95                 100
Ala Phe Ser Asp Tyr Ser Leu Leu Asp Asp
                105                 110
Val Val Gly Leu Val Leu His Ser Ser Leu
                115                 120
Met Val Pro Tyr Phe Ser Trp Lys Tyr Ser
                125                 130
His Arg Arg His His Ser Asn Thr Gly Ser
                135                 140
Leu Glu Arg Asp Glu Val Phe Val Pro Lys
                145                 150
Lys Lys Glu Ala Leu Pro Trp Tyr Thr Pro
                155                 160
Tyr Val Tyr Asn Asn Pro Val Gly Arg Val
                165                 170
Val His Ile Val Val Gln Leu Thr Leu Gly
                175                 180
Trp Pro Leu Tyr Leu Ala Thr Asn Ala Ser
                185                 190
Gly Arg Pro Tyr Pro Arg Phe Ala Cys His
                195                 200
Phe Asp Pro Tyr Gly Pro Ile Tyr Asn Asp
                205                 210
Arg Glu Arg Ala Gln Ile Phe Val Ser Asp
                215                 220
Ala Gly Val Val Ala Val Ala Phe Gly Leu
                225                 230
Tyr Lys Leu Ala Ala Ala Phe Gly Val Trp
                235                 240
Trp Val Val Arg Val Tyr Ala Val Pro Leu
                245                 250
Leu Ile Val Asn Ala Trp Leu Val Leu Ile
                255                 260
Thr Tyr Leu Gln His Thr His Pro Ser Leu
                265                 270
Pro His Tyr Asp Ser Ser Glu Trp Asp Trp
                275                 280
Leu Arg Gly Ala Leu Ala Thr Met Asp Arg
                285                 290
Asp Tyr Gly Ile Leu Asn Arg Val Phe His
                295                 300
Asn Ile Thr Asp Thr His Val Ala His His
                305                 310
Leu Phe Ser Thr Met Pro His Tyr His Ala
                315                 320
Met Glu Ala Thr Lys Ala Ile Arg Pro Ile
                325                 330
Leu Gly Asp Tyr Tyr His Phe Asp Pro Thr
                335                 340
Pro Val Ala Lys Ala Thr Trp Arg Glu Ala
                345                 350
Gly Glu Cys Ile Tyr Val Glu Pro Glu Asp
                355                 360
Arg Lys Gly Val Phe Trp Tyr Asn Lys Lys
                365                 370
Phe Xaa



(2) INFORMATION FOR SEQ ID NO: 11 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 224 amino acids 

(B) TYPE: amino acid 
{ C ) STRANDEDNESS : 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Trp Val Met Ala His Asp Cys Gly His His 

5 10 

Ala Phe Ser Asp Tyr Gln Leu Leu Asp Asp

15 20

Val Val Gly Leu Ile Leu His Ser Cys Leu

25 30

Leu Val Pro Tyr Phe Ser Trp Lys His Ser 

35 40 

His Arg Arg His His Ser Asn Thr Gly Ser

45 50

Leu Glu Arg Asp Glu Val Phe Val Pro Lys

55 60 

Lys Lys Ser Ser Ile Arg Trp Tyr Ser Lys

65 70

Tyr Leu Asn Asn Pro Pro Gly Arg Ile Met

75 80

Thr Ile Ala Val Thr Leu Ser Leu Gly Trp

85 90 

Pro Leu Tyr Leu Ala Phe Asn Val Ser Gly 

95 100 

Arg Pro Tyr Asp Arg Phe Ala Cys His Tyr 

105 110 

Asp Pro Tyr Gly Pro Ile Tyr Asn Asp Arg

115 120

Glu Arg Ile Glu Ile Phe Ile Ser Asp Ala

125 130 

Gly Val Leu Ala Val Thr Phe Gly Leu Tyr 

135 140 

Gln Leu Ala Ile Ala Lys Gly Leu Ala Trp

145 150 

Val Val Cys Val Tyr Gly Val Pro Leu Leu 

155 160 

Val Val Asn Ser Phe Leu Val Leu Ile Thr

165 170

Phe Leu Gln His Thr His Pro Ala Leu Pro

175 180 

His Tyr Asp Ser Ser Glu Trp Asp Trp Leu 

185 190 

Arg Gly Ala Leu Ala Thr Val Asp Arg Asp 

195 200 

Tyr Gly Ile Leu Asn Lys Val Phe His Asn

205 210

Ile Thr Asp Thr Gln Val Ala His His Leu

215 220

Phe Thr Met Pro



(2) INFORMATION FOR SEQ ID NO: 12 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 nucleotides 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
GCTCTTTTGT GCGCTCATTC 20 



(2) INFORMATION FOR SEQ ID NO: 13 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 nucleotides

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
CGGTACCAGA AAACGCCTTG 20 
(2) INFORMATION FOR SEQ ID NO: 14

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 nucleotides

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
TAYWSNCAYM GNMGNCAYCA 20

(2) INFORMATION FOR SEQ ID NO: 15 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 nucleotides 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
RTGRTGNGCN ACRTGNGTRT C 21
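
SEQ ID NOS: 14 and 15 are written with IUPAC ambiguity codes (Y, W, S, N, M, R), so each entry stands for a pool of related oligonucleotides rather than a single sequence. The following Python sketch is an illustration only and is not part of the listing. It counts how many distinct sequences each entry covers and forms the reverse complement of SEQ ID NO: 15. The function names are hypothetical, and the listing itself does not state in this passage how the oligonucleotides were used.

```python
from math import prod

# Standard IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement of every IUPAC code (for example R = A/G pairs with Y = T/C).
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W",
    "K": "M", "M": "K", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}


def degeneracy(seq: str) -> int:
    """Number of distinct unambiguous sequences represented by seq."""
    return prod(len(IUPAC[base]) for base in seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement of a sequence, ambiguity codes preserved."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# Sequences copied from the listing, with the spacing removed.
SEQ_ID_14 = "TAYWSNCAYMGNMGNCAYCA"
SEQ_ID_15 = "RTGRTGNGCNACRTGNGTRTC"

if __name__ == "__main__":
    print(degeneracy(SEQ_ID_14))
    print(degeneracy(SEQ_ID_15))
    print(reverse_complement(SEQ_ID_15))
```

Read in codons, the reverse complement of SEQ ID NO: 15 (GAY ACN CAY GTN GCN CAY CAY) corresponds to Asp Thr His Val Ala His His, and SEQ ID NO: 14 (TAY WSN CAY MGN MGN CAY CA) to Tyr Ser His Arg Arg His His. Both motifs occur in the amino acid sequences listed above.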
WHAT IS CLAIMED IS:

1. A method of altering an amount of an
unsaturated fatty acid in a seed of a plant
comprising: decreasing a fatty acid desaturase
activity in the seed by genetic manipulation of at
least one of fatty acid desaturase or fatty acid
hydroxylase.

2. The method of Claim 1, wherein an 
endogenous gene for said fatty acid hydroxylase is 
mutated and thereby decreases fatty acid hydroxylase 
activity in the seed. 

3. The method of Claim 1, wherein said plant 
is transformed with a nucleic acid containing a 
sequence which encodes a fatty acid hydroxylase or 
derivative thereof.

4. The method of Claim 3, wherein said 
derivative is a dominant negative mutant which 
thereby alters the amount of the unsaturated fatty 
acid in the seed. 

5. The method of Claim 3, wherein said 
derivative is a mutant fatty acid hydroxylase in 
which one or more essential histidine residues have
been mutated which thereby alters the amount of the 
unsaturated fatty acid in the seed. 

6. The method of Claim 1, wherein an 
endogenous gene for said fatty acid desaturase is 
mutated and thereby decreases fatty acid desaturase 
activity in the seed.

7. The method of Claim 1, wherein said plant
is transformed with a nucleic acid containing a 
sequence which encodes a fatty acid desaturase or 
derivative thereof.

8. The method of Claim 7, wherein said
derivative is a dominant negative mutant which 
thereby alters the amount of the unsaturated fatty 
acid in the seed. 

9. The method of Claim 7, wherein said 
derivative is a mutant fatty acid desaturase in 
which one or more essential histidine residues have 
been mutated which thereby alters the amount of the 
unsaturated fatty acid in the seed. 

10. The method of Claim 1, wherein said plant
is selected from the group consisting of rapeseed,
Crambe, Brassica juncea, canola, flax, sunflower,
safflower, cotton, cuphea, soybean, peanut, coconut,
oil palm and corn.

11. A method of altering an amount of an
unsaturated fatty acid comprising: 

(a) transforming a plant cell with a nucleic 
acid containing a sequence which encodes a fatty 
acid hydroxylase or a dominant negative mutant of 
fatty acid hydroxylase or a dominant negative mutant 
of fatty acid desaturase, 

(b) growing a seed-bearing plant from the 
transformed plant cell of step (a), and

(c) identifying a seed from the plant of step 
(b) with the altered amount of the unsaturated fatty 
acid in the seed.

12. The method of Claim 11, wherein said
nucleic acid contains a sequence which encodes the 
dominant negative mutant of fatty acid hydroxylase 
in which one or more essential histidine residues 
have been mutated. 

13. The method of Claim 11, wherein said 
nucleic acid contains a sequence which encodes the 
dominant negative mutant of fatty acid hydroxylase 
which thereby alters the amount of the unsaturated 
fatty acid in the seed. 

14. The method of Claim 11, wherein said 
nucleic acid contains a sequence which encodes the 
dominant negative mutant of fatty acid desaturase in 
which one or more essential histidine residues have 
been mutated. 

15. The method of Claim 11, wherein said 
nucleic acid contains a sequence which encodes the 
dominant negative mutant of fatty acid desaturase 
which thereby alters the amount of the unsaturated 
fatty acid in the seed. 

16. The method of Claim 11, wherein said plant 
is selected from the group consisting of rapeseed, 
Crambe, Brassica juncea, canola, flax, sunflower,
safflower, cotton, cuphea, soybean, peanut, coconut,
oil palm and corn. 

17. A recombinant nucleic acid suitable for 
use in Claim 1, wherein said nucleic acid contains a 
sequence encoding a fatty acid hydroxylase with an 
amino acid identity of 60% or greater to SEQ ID
NO: 4.

18. The recombinant nucleic acid of Claim 17, 
wherein the amino acid identity is 90% or greater to 
SEQ ID NO: 4. 

19. The recombinant nucleic acid of Claim 17, 
wherein the amino acid identity is 100% of SEQ ID
NO: 4.

20. The recombinant nucleic acid of Claim 17, 
wherein said nucleic acid contains a sequence having 
a nucleotide identity of 90% or greater to SEQ ID
NO: 1, 2 or 3.

21. The recombinant nucleic acid of Claim 17,
wherein said nucleic acid contains SEQ ID NO: 1, 2 or
3.

22. The recombinant nucleic acid of Claim 17, 
wherein said sequence is obtainable from a plant 
species producing a hydroxylated fatty acid. 

23. A recombinant nucleic acid suitable for
use in Claim 1, wherein said nucleic acid contains a 
sequence encoding at least one of fatty acid 
desaturase or fatty acid hydroxylase. 

24. The recombinant nucleic acid of Claim 23, 
wherein said sequence is obtainable from Ricinus
communis (L.) (castor).

25. The recombinant nucleic acid of Claim 23,
wherein said sequence is obtainable from Lesquerella
fendleri.

26. The recombinant nucleic acid of Claim 23, 
wherein said nucleic acid contains a sequence 
encoding at least one of fatty acid desaturase or 
fatty acid hydroxylase in which one or more 
essential histidine residues have been mutated. 

27. The method of Claim 1 further comprising: 
processing the seed containing the altered amount of 
the unsaturated fatty acid to obtain oil and/or seed 
meal.

28. Oil obtained by the method of Claim 27. 

29. Seed meal obtained by the method of Claim 27.

30. Plant obtained by the method of Claim 1. 

31. The method of Claim 11 further comprising: 
processing the seed containing the altered amount of 
the unsaturated fatty acid to obtain oil and/or seed 
meal.

32. Oil obtained by the method of Claim 31.

33. Seed meal obtained by the method of Claim 31.



34. Plant obtained by the method of Claim 11. 
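
Claims 17 to 20 recite identity thresholds (60%, 90%, 100%) without stating how identity is calculated. As an illustration only, and not as the method used by the applicants, the following Python sketch computes percent identity over two pre-aligned sequences of equal length, taking the alignment length as the denominator (an assumption; other conventions exist). The two example strings are one-letter transcriptions of residues 101 to 120 of the amino acid sequence that precedes SEQ ID NO: 11 in the listing and of residues 11 to 30 of SEQ ID NO: 11. The function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must already be aligned (same length, '-' for gaps).
    Columns in which either sequence has a gap never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# One-letter transcription of two 20-residue stretches from the listing.
SEGMENT_A = "AFSDYSLLDDVVGLVLHSSL"
SEGMENT_B = "AFSDYQLLDDVVGLILHSCL"

if __name__ == "__main__":
    identity = percent_identity(SEGMENT_A, SEGMENT_B)
    print(f"{identity:.1f}% identical")
    print("meets 60% threshold:", identity >= 60.0)
    print("meets 90% threshold:", identity >= 90.0)
```

The two segments differ at three of twenty positions, so this stretch alone would satisfy a 60% threshold but not a 90% threshold. A full comparison would require aligning the complete sequences first.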
Figure 4A. Mass spectrum of peak 10 from Figure 3B. Annotated fragment ions:

Ion #1: Mass 187 [CH3-(CH2)5-CH-O-Si-(CH3)3]+

Ion #2: Mass 299 [(CH3)3-Si-O-CH-CH2-CH=CH-(CH2)7-C(O)-O-CH3]+

Ion #3: Mass 270 (characteristic rearrangement ion) [CH2-CH=CH-(CH2)7-C(O-Si-(CH3)3)-O-CH3]+

Ion #4: Mass 185 (desaturated analog of Ion #1) [CH3-(CH2)2-CH=CH-CH2-CH-O-Si-(CH3)3]+

Ion #5: Mass 298 (elongated analog of Ion #3) [CH2-CH=CH-(CH2)9-C(O-Si-(CH3)3)-O-CH3]+
Phe Ile Leu Gly Trp Pro
TTT ATC CTC GGG TGG CCT

Pro Tyr Asp Gly Phe Ala
CCT TAT GAT GGT TTC GCT

Lys Asp Arg Glu Arg Leu
AAA GAC CGA GAA CGC CTC

GCT GTC TGT TAT GGT CTT
260 270 280 290 300

LFFAH12 251 YAASQGLTAM ICVYGVPLLI VNFFLVLVTF LQHTHPSLPH YDSTEWEWIR 300

FAH12 251 ATNAKGLAWV MRIYGVPLLI VNCFLVMITY LQHTHPAIPR YGSSEWDWLR 300

ATFAD2 251 YAAAQGMASM ICLYGVPLLI VNAFLVLITY LQHTHPSLPH YDSSEWDWLR 300

BNFAD2 251 YAAVQGVASM VCFLRVPLLI VNGFLVLITY LQHTHPSLPH YDSSEWDWLR 300

GMFAD2-1 251 VATLKGLVWL LCVYGVPLLI VNGFLVTITY LGHTHFALPH YDSSEWDWLK 300

GMFAD2-2 251 LAMAKGLAWV VCVYGVPLLV VNGFLVLITF LQHTHPALPH YTSSEWDWLR 300

ZMFAD2 251 LAAAFGVWWV VRVYAVPLLI VNAWLVLITY LQHTHPSLPH YDSSEWDWLR 300

RCFAD2 251 LAIAKGLAWV VCVYGVPLLV VNSFLVLITF LQHTHPALPH YDSSEWDWLR 300



310 320 330 340 350

LFFAH12 301 GALVTVDRDY GILNKVFHNI TDTHVAHHLF ATIPHYNAME ATEAIKPILG 350

FAH12 301 GAMVTVDRDY GVLNKVFHNI ADTHVAHHLF ATVPHYHAME ATKAIKPIMG 350

ATFAD2 301 GALATVDRDY GILNKVFHNI TDTHVAHHLF STMPHYNAME ATKAIKPILG 350

BNFAD2 301 GALATVDRDY GILN0GFHNI TDTHEAHHLF STMPHYHAME ATKAIKPILG 350

GMFAD2-1 301 GALATMDRDY GILNKVFHHI TDTHVAHHLF STMPHYHAME ATNAIKPILG 350

GMFAD2-2 301 GALATVDRDY GILNKVFHNI TDTHVAHHLF STMPHYHAME ATKAIKPILG 350

ZMFAD2 301 GALATMDRDY GILNRVFHNI TDTHVAHHLF STMPHYHAME ATKAIRPILG 350

RCFAD2 301 GALATVDRDY GILNKVFHNI TDTQVAHHLF 350

360 370 380 390 400

LFFAH12 351 DYYHFDGTPW YVAMYREAKE CLYVEPDTER GKKGVYYYNN K-L 400

FAH12 351 EYYRYDGTPF YKALWREAKE CLFVEPDEGA PTQGVFWYRN KY- 400

ATFAD2 351 DYYQFDGTPW YVAMYREAKE CIYVEPDREG DKKGVYWYNN K-L 400

BNFAD2 351 EYYQFDGTPV VKAMWREAKE CIYVEPDRQG EKKGVFWYNN KL* 400

GMFAD2-1 351 EYYQFDDTPF YKALWREARE CLYVEPDEGT SEKGVYWYRN KY- 400

GMFAD2-2 351 EYYRFDETPF VKAMWREARE CIYVEPDQST ESKGVFWYNN KL- 400

ZMFAD2 351 DYYHFDPTPV AKATWREAGE CIYVEPE--- DRKGVFWYNK KF* 400
Plasmid name: pSLJ44026
Plasmid size: 25.70 kb
Constructed by: Jonathon Jones
Construction date: 1992

Comments/References: Transgenic Research 1, 285-297 (1992)
ALTERED LINOLENIC AND LINOLEIC ACID CONTENT IN PLANTS

This is a continuation-in-part of U.S. Serial No. 08/156,551
filed November 22, 1993, which is a continuation of U.S. Serial No.
08/014,431, filed on February 5, 1993. The present invention relates to
genetically engineered plants. In particular it relates to genetically
engineered plants and seeds which have altered linolenic and linoleic acid
content compared with naturally occurring plants.

BACKGROUND

Many crop species produce seed oils in which the fatty acid
composition is not ideally suited to the intended use. The application of
conventional breeding methods, coupled in some cases with mutagenesis,
has resulted in the production of new varieties of several species with
desirable alterations in the fatty acid composition of seed oil. A notable
example is the development of low erucic acid varieties of rapeseed
(Stefansson 1983). Similar efforts have resulted in the reduction of the
level of polyunsaturated 18-carbon fatty acids in soybean (Wilcox and
Cavins 1985; Graef et al. 1988), sunflower (Fick 1989), and linseed oils
(Green and Marshall 1984).

Most of the genetic variation in seed lipid fatty acid
composition appears to involve the presence of an allele of a gene that
disrupts normal fatty acid metabolism and leads to an accumulation of
intermediate fatty acid products in the seed storage lipids (Downey 1987).
However, it seems likely that, because of the inherent limitations of this
approach, many other desirable changes in seed oil fatty acid composition
may require the directed application of genetic engineering methods.

α-Linolenic acid (18:3Δ9,12,15) is an eighteen-carbon fatty acid
containing three cis double bonds at the 9-10, 12-13 and 15-16 carbons. It
is found in the cells of higher plants as a constituent of cell membranes. It
is also found in storage organs, such as seeds. There it is located in oil
bodies, which are bounded by an electron-dense structure that is thought to
be a half-unit membrane and are dispersed in the cytoplasmic environment of
cells. When present as a constituent of cell membranes, linolenic acid is
usually esterified to the sn-1 or sn-2 position of the glycerol moiety of a
diacyl-glycerolipid. By contrast, when present in oil bodies, linolenic acid is
usually esterified to the sn-1, sn-2 or sn-3 position of a triacylglycerolipid
(TAG).
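
The shorthand used above for α-linolenic acid gives the number of carbons, the number of double bonds, and the Δ positions of those bonds counted from the carboxyl end. As an illustration only, the following Python sketch parses that notation and lists the carbon pairs joined by each double bond. The class and function names are hypothetical, and the parser assumes the plain-text form "carbons:bonds" followed by "Δ" and comma-separated positions.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FattyAcid:
    carbons: int                    # chain length
    double_bonds: Tuple[int, ...]   # delta positions, counted from the carboxyl carbon


def parse_shorthand(text: str) -> FattyAcid:
    """Parse shorthand such as '18:3Δ9,12,15' or '18:0'."""
    match = re.fullmatch(r"(\d+):(\d+)(?:Δ(\d+(?:,\d+)*))?", text)
    if match is None:
        raise ValueError(f"unrecognized shorthand: {text!r}")
    carbons = int(match.group(1))
    bond_count = int(match.group(2))
    positions = (
        tuple(int(p) for p in match.group(3).split(",")) if match.group(3) else ()
    )
    if len(positions) != bond_count:
        raise ValueError("number of positions does not match number of double bonds")
    return FattyAcid(carbons, positions)


def bond_spans(acid: FattyAcid) -> List[str]:
    """Carbon pairs joined by each double bond, e.g. position 9 -> '9-10'."""
    return [f"{p}-{p + 1}" for p in acid.double_bonds]


if __name__ == "__main__":
    linolenic = parse_shorthand("18:3Δ9,12,15")
    linoleic = parse_shorthand("18:2Δ9,12")
    print(linolenic.carbons, bond_spans(linolenic))
    print(linoleic.carbons, bond_spans(linoleic))
```

In this representation, the desaturation of linoleic acid to α-linolenic acid corresponds to adding position 15 to the tuple of double bonds.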

Linolenic acid is extensively used in the paint and varnish
industry in view of its rapid oxidation. Flax seed is a predominant source of
this oil. Soybean seed, on the other hand, does not have sufficient linolenic
acid content to be used in this industry. Thus, increasing the linolenic acid
content in a plant such as soybean would permit the use of the soybean oil
in the paint and varnish industry.

On the other hand, it is undesirable to have significant levels
of linolenic acid in cooking oils and foods. Linolenic acid is unstable during
cooking and is rapidly oxidized. The oxidized products impart rancidity to
the finished product. A rapeseed or soybean oil with reduced linolenic acid,
such as one containing 2% or less of linolenic acid, would be ideal for use as a
cooking oil.

Linolenic acid is also a precursor in the biosynthesis of
jasmonic acid, an important plant growth regulator. Linolenic acid is
converted to jasmonic acid by introduction of an oxygen to the carbon chain
by a lipoxygenase, followed by dehydration, reduction, and several
oxidations (Vick and Zimmerman, 1984). The activity of jasmonic acid has
been measured in terms of induction of pathogen defense responses. By
application of free linolenic acid to plants, plant pathogen defenses can also
be induced (Farmer and Ryan, 1992).

A model has been proposed to explain the ability of free
linolenic acid to exhibit the effects associated with jasmonic acid (Farmer
and Ryan, 1992). It is hypothesized that all of the enzymatic activities
which are required for the conversion of linolenic acid to jasmonic acid are
constitutively present in the cell and the rate limiting step in the production
of jasmonic acid is the availability of free linolenic acid. A likely route for
the production of the free linolenic acid is by the activity of a lipase in the
plasma membrane.

It has been observed that exogenous jasmonic acid can more
powerfully activate defense responses than can wounding. This suggests
that wounds cannot generate enough free linolenic acid to support high level
production of jasmonic acid. The activity of the lipase or the availability of
appropriate substrate for the lipase may be rate limiting upon wounding.
Thus, increasing the linolenic acid content of plasma membrane may
positively influence "signal transduction" in plants and result in better
protection against environment and pathogen stress.

Linolenic, oleic and linoleic acids are also
important constituents, as well as precursors of volatile carbonyl
compounds, which contribute to the aroma of both fresh and cooked foods.
The major fatty acids of tomato fruit pericarp are oleic, linoleic and linolenic
acids. As the fruit ripens, the levels of the latter two fatty acids decline,
resulting in the production of a number of 4-6 carbon containing aldehydes
and ketones. One particular metabolite, cis-3-hexanol, has been shown to
be present in higher levels in vine-ripened tomatoes compared to
supermarket tomatoes or tomatoes stored in refrigerators. It is likely,
therefore, that the "aroma" of fresh fruits and vegetables can be
"modulated" by regulation of the content of linolenic and linoleic acids,
important substrates for the enzyme lipoxygenase and subsequently the
hydroperoxide cleaving enzyme, which generates the volatile "aroma"
compounds.

From the above, it is clear that the ability to vary the content
of linolenic acid in plants would be desirable. However, to achieve this
result it is necessary to determine what controls the production of linolenic
acid in plants.

A large body of experimental evidence derived from
radiochemical tracer studies has indicated that α-linolenic acid is
synthesized by the desaturation of linoleic acid (18:2Δ9,12) (reviewed in
Harwood 1988). However, the actual substrate for desaturation is not
known.

In vivo and in vitro labelling studies suggest that there are
possibly two distinct pathways for the synthesis of linolenic acid (Browse
and Somerville, 1991). One possible pathway is thought to be located in the
endoplasmic reticulum where linoleic acid esterified to the sn-2 position of
phosphatidylcholine is a substrate for desaturation. However, the
available evidence does not exclude the possibility that linoleic acid
esterified to other lipids may also be a substrate.

A second possible pathway of linoleic acid desaturation is
located in the plastid where the available evidence suggests that linoleic
acid esterified to monogalactosyldiacylglycerol and, possibly, other plastid
lipids is the substrate for desaturation.

Relatively little direct information is available concerning the
enzymes involved in linoleic acid desaturation. Low levels of enzyme
activity have been detected in microsomal membrane preparations from
developing linseed (Linum usitatissimum) (Browse and Slack, 1981) and, more
recently, in preparations of gently lysed chloroplasts (Schmidt and Heinz,
1990a,b). The general features of the enzyme may be inferred from
information available about other enzymes of this class.

The most thoroughly characterized desaturase is the stearoyl-Coenzyme A
(CoA) desaturase from vertebrate liver (reviewed by
Holloway, 1983). This enzyme has been shown to be an integral membrane
protein which contains non-heme iron. The desaturase reaction requires
fatty acyl-CoA, molecular oxygen and reduced cytochrome b5, another
membrane protein. In vivo, the reduced cytochrome b5 is produced by the
transfer of reducing equivalents from NADH via the activity of
cytochrome b5 reductase, a flavin containing membrane protein.

The most thoroughly characterized desaturase from plants is
the stearoyl-ACP desaturase (McKeon and Stumpf, 1982; Shanklin and
Somerville, 1991). This enzyme also requires molecular oxygen and a high
potential reductant. However, in contrast to the animal enzyme, this
desaturase is a soluble plastid protein which preferentially acts on a fatty
acid esterified to acyl carrier protein (ACP) rather than CoA. This enzyme
also differs from the animal enzyme by utilizing reduced ferredoxin as an
intermediate electron donor.

Other plant desaturases appear to be membrane proteins.
The microsomal Δ12 oleate desaturase has been
assayed in membrane preparations from several plant species (Harwood, 1988).
As with the stearoyl-CoA desaturase from animals, this enzyme requires
molecular oxygen and reduced cytochrome b5 as an electron donor (Kearns
et al., 1991). However, it appears that oleate esterified to a phospholipid is
the substrate rather than a CoA ester.

With regard to the activity responsible for the making of
linolenic acid, little was known as to its source or origin. However, evidence
that the amount of linolenic acid is related to the amount of linoleic acid
desaturase activity has been obtained by analysis of the properties of the
fad3 mutant of Arabidopsis thaliana (Lemieux et al. 1990). This mutant is
deficient in linolenic acid in the storage oils of its seed lipids and in the
membrane lipids of different tissues to varying degrees. The mutant also
had an increase in the amount of linoleic acid. This can be interpreted as
evidence that the mutant is defective in the activity of a desaturase which
converts linoleic acid to linolenic acid.

There is further evidence to suggest that the activity of this
desaturase could be rate limiting for linolenic acid synthesis under normal
circumstances. This was discovered by measuring the effects on fatty acid
composition in heterozygous plants (i.e., fad3+/fad-) formed by crossing the
wild type with the fad3 mutant. In these F1 plants, which have one copy of
the normal fad3 gene instead of the two normally found in the wild
type, the amount of linolenic acid was almost exactly intermediate between
the amounts found in the two parents. This suggests that the amount of linolenic acid
is proportional to the amount of functional fad3 gene product (Lemieux et
al., 1990).
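
The gene-dosage reasoning above can be expressed as a simple additive model: if linolenic acid content is proportional to the number of functional gene copies, the heterozygote should fall at the midpoint of the two parents. The following Python sketch is an illustration only. The function names are hypothetical, and the numerical values are arbitrary placeholders, not measurements reported by Lemieux et al.

```python
def expected_heterozygote(wild_type: float, mutant: float) -> float:
    """Midparent value predicted for the F1 under a purely additive model."""
    return (wild_type + mutant) / 2


def dosage_deviation(wild_type: float, mutant: float, f1: float) -> float:
    """Deviation of the observed F1 value from the midparent value.

    The result is scaled so that 0 means exactly intermediate,
    +1 means the F1 equals the wild type, and -1 means it equals the mutant.
    """
    half_range = (wild_type - mutant) / 2
    if half_range == 0:
        raise ValueError("parents must differ for the ratio to be defined")
    return (f1 - expected_heterozygote(wild_type, mutant)) / half_range


if __name__ == "__main__":
    # Hypothetical linolenic acid levels (percent of total fatty acids).
    wild_type_level = 20.0
    mutant_level = 4.0
    observed_f1 = 12.5

    print(expected_heterozygote(wild_type_level, mutant_level))
    print(dosage_deviation(wild_type_level, mutant_level, observed_f1))
```

A deviation close to zero is what the text describes as "almost exactly intermediate", and is the pattern expected when the gene product is rate limiting.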

These results do not shed any light, however, on the nature of
the fad3 gene product or on whether the observed effects in mutants are
related to a decrease in the quantity of desaturase protein or to a decrease
in desaturase activity due to a defective protein.

Moreover, nothing is known with any degree of certainty
about the linoleic acid desaturase from plant microsomes. As noted above,
very little is known about the microsomal desaturases except that they
probably utilize reduced cytochrome b5 as intermediate electron donor and
probably utilize lipids rather than CoA or ACP esters as substrates.

Moreover, as in many other aspects of plant biology, the lack
of specific information about the biochemistry and regulation of lipid
metabolism makes it difficult to predict how the introduction of one or a few
genes might usefully alter seed lipid synthesis.

An additional problem arises from the fact that many of the
key enzymes of lipid metabolism are membrane-bound and present in low
quantities. Thus, attempts to solubilize and purify them from plant
sources have not been successful.

SUMMARY OF THE INVENTION

The present invention provides structural coding sequences
encoding linoleic acid desaturase activity which can be used to alter the
linoleic and linolenic acid compositions of plants or to isolate other plant
linoleic acid desaturases. The present invention further provides a plant
capable of expressing a structural coding sequence to control the level of
linolenic acid or linoleic acid or both in the plant. The present invention
further provides a method for controlling the levels of linoleic and linolenic
acid in plants. It is also demonstrated by the present invention that the
linoleic acid desaturase enzyme activity in plant cells and tissues is a
controlling step in linolenic acid biosynthesis.

The present invention further relates to the engineering of two
advantageous traits into plants: increased and decreased α-linolenic acid
content in the structural lipids or storage oils of various crop plants.

In accomplishing the foregoing, there is provided, in
accordance with one aspect of the present invention, a genetically
transformed plant which has an elevated linolenic acid content comprising
a recombinant, double-stranded DNA molecule comprising

(i) a promoter that functions in plant cells to cause
the production of an RNA sequence, said promoter
operably linked to;

(ii) a structural coding sequence that causes the
production of an RNA sequence that encodes a linoleic
acid desaturase activity; and

(iii) a 3' non-translated region that functions in plant
cells to promote polyadenylation to the 3' end of said RNA
sequence.

In accordance with another aspect of the present invention,
there is provided a genetically transformed plant which has a reduced
linolenic acid content, comprising a recombinant, double-stranded DNA
molecule comprising

(i) a promoter that functions in plant cells to cause
the production of an RNA sequence, said promoter
operably linked to;

(ii) a DNA sequence that causes the production of an
RNA sequence that is in antisense orientation to at least
a portion of a gene that encodes a linoleic acid desaturase
activity in said plant; and

(iii) a 3' non-translated region that functions in plant
cells to promote polyadenylation to the 3' end of said RNA
sequence.
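
The two constructs described above share the same three-part layout and differ only in the orientation of the element placed between the promoter and the 3' non-translated region. As an illustration only, the following Python sketch models that layout. The class and function names, the element labels, and the six-base fragment are hypothetical placeholders and are not sequences or vectors from this application.

```python
from dataclasses import dataclass

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


@dataclass(frozen=True)
class ExpressionCassette:
    promoter: str      # label of element (i)
    insert: str        # element (ii), written 5'->3' as read from the promoter
    three_prime: str   # label of element (iii), the 3' non-translated region
    orientation: str   # "sense" or "antisense"


def build_cassette(promoter: str, coding_fragment: str, three_prime: str,
                   antisense: bool = False) -> ExpressionCassette:
    """Place a coding fragment behind a promoter in sense or antisense orientation."""
    fragment = coding_fragment.upper()
    insert = reverse_complement(fragment) if antisense else fragment
    orientation = "antisense" if antisense else "sense"
    return ExpressionCassette(promoter, insert, three_prime, orientation)


def transcript(cassette: ExpressionCassette) -> str:
    """RNA produced from the insert: same 5'->3' sequence with U in place of T."""
    return cassette.insert.replace("T", "U")


if __name__ == "__main__":
    fragment = "ATGGCC"  # hypothetical stand-in for part of a desaturase gene
    sense = build_cassette("promoter", fragment, "3' region")
    anti = build_cassette("promoter", fragment, "3' region", antisense=True)
    print(sense.orientation, transcript(sense))
    print(anti.orientation, transcript(anti))
```

The antisense transcript is the reverse complement of the sense transcript, which is what allows it to pair with the messenger RNA of the endogenous gene.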

There has also been provided, in accordance with another aspect
of the present invention, a method of producing a genetically transformed
plant which has an elevated or reduced linolenic acid content. There has
also been provided, in accordance with another aspect of the present
invention, a recombinant, double-stranded DNA molecule and plant cells
containing a recombinant, double-stranded DNA molecule.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows the genetic map of the region of chromosome 2 of
Arabidopsis thaliana where a linoleic acid desaturase gene is located and
the identity of the yeast artificial chromosomes which carry this region of
the genome.

Figure 2 shows the structure of plasmid pBNDES3 which was
obtained by inserting an EcoRI fragment containing the B. napus linoleic
acid desaturase cDNA (fad3) into pBLUESCRIPT.

Figure 3 shows the nucleotide sequence (SEQ ID NO:1) and
deduced amino acid sequence (SEQ ID NO:2) for the linoleic acid desaturase
cDNA (fad3) from B. napus.

Figure 4 shows a comparison of the deduced amino acid sequence
of one linoleic acid desaturase cDNA (fad3) from B. napus and the desA
gene from Synechocystis. Identical residues are indicated by a solid box.
Conservative substitutions are indicated by a stippled box.

Figure 5 shows the binary Ti plasmid vector pBI121.

Figure 6 shows the binary Ti plasmid pTiDES3 which was
constructed by insertion of a linoleic acid desaturase cDNA (fad3) into
pBI121.

Figure 7 shows the map of the plant transformation vector
pMON13804.

Figure 8 shows the map of the plant transformation vector
pMON13805.

Figure 9 shows the oil content of control and transformed canola
seed in accordance with the present invention.

Figure 10 shows the nucleotide sequence (SEQ ID NO:9) for the
linoleic acid desaturase cDNA (fadD) from Arabidopsis.

Figure 11 shows the deduced amino acid sequence (SEQ ID
NO:10) for the linoleic acid desaturase cDNA (fadD) from Arabidopsis.

Figure 12 shows the nucleotide sequence (SEQ ID NO:11) for the
linoleic acid desaturase cDNA (fadE) from Arabidopsis.

Figure 13 shows the deduced amino acid sequence (SEQ ID
NO:12) for the linoleic acid desaturase cDNA (fadE) from Arabidopsis.

DETAILED DESCRIPTION OF THE INVENTION

A genetically transformed plant of the present invention which
has an altered linolenic or linoleic acid content can be obtained by
expressing the double-stranded DNA molecules described in this
application.

The expression of a double-stranded DNA involves transcription
of messenger RNA (mRNA) from one strand of the DNA by RNA
polymerase enzyme, and the subsequent processing of the mRNA primary
transcript inside the nucleus. This processing involves a 3' non-translated
region which directs the addition of polyadenylate nucleotides to the 3' end
of the RNA.

Promoters

Transcription of DNA into mRNA is regulated by a region of
DNA usually referred to as the "promoter." The promoter region contains a
sequence of bases that signals RNA polymerase to associate with the
DNA, and to initiate the transcription of mRNA using one of the DNA
strands as a template to make a corresponding complementary strand of
RNA.

Any promoter which is known or is found to cause transcription
of RNA in plant cells can be used in the present invention. Promoters
which are useful in the present invention include any promoter that
functions in a plant cell to cause the production of an RNA sequence. A
number of promoters which are active in plant cells and are capable of
producing an RNA sequence have been described in the literature. These
include the nopaline synthase (NOS) and octopine synthase (OCS)
promoters (which are carried on tumor-inducing plasmids of Agrobacterium
tumefaciens), the caulimovirus promoters such as the cauliflower mosaic
virus (CaMV) 19S and 35S and the figwort mosaic virus 35S promoters,
the light-inducible promoter from the small subunit of ribulose-1,5-bis-phosphate
carboxylase (ssRUBISCO, a very abundant plant polypeptide),
and the chlorophyll a/b binding protein gene promoter, etc. All of these
promoters have been used to create various types of DNA constructs
which have been expressed in plants; see, e.g., PCT publication WO
84/02913 (Rogers et al., Monsanto).

Promoters may be obtained from a variety of sources such as
plants and plant viruses. Promoters can be used in the form that they
exist as isolated from plant genes such as ssRUBISCO genes, or can be
modified to improve their effectiveness, such as with the enhanced
CaMV35S promoter.

Those skilled in the art will recognize that the amount of linoleic
acid desaturase needed to induce the desired alteration in linolenic acid
content may vary with the type of plant. It is also possible that extremes
in linoleic acid desaturase activity may be deleterious to the plant.
Therefore, in a preferred embodiment, promoter function should be
optimized by selecting a promoter with the desired tissue expression
capabilities and approximate promoter strength and selecting a
transformant which produces the desired linoleic acid desaturase activity in
the target tissues.

This selection approach from the pool of transformants is
routinely employed in expression of heterologous structural genes in plants
since there is variation between transformants containing the same
heterologous gene due to the site of gene insertion within the plant genome
(commonly referred to as "position effect").

In a preferred embodiment, the promoters utilized in the double-stranded
DNA molecules should have relatively high expression in tissues
where the increased or decreased linolenic acid content is desired, such as
the seeds of the plant. In Canola, a particularly preferred promoter in this
regard is the seed specific promoter described herein in greater detail in the
accompanying examples.

In another preferred embodiment, the promoter used in the
expression of the double-stranded DNA molecules of the present invention
can be a constitutive promoter, expressing the DNA molecule in all or most
of the tissues of the plant. However, the promoter selected for this
embodiment should not cause expression at levels which are detrimental
to plant health, growth and development.

β-conglycinin (also known as the 7S protein) is one of the major
storage proteins in soybean (Glycine max) (Meinke et al., 1981). The 7S
(β-conglycinin) α'-subunit promoter, used in one aspect of this study to express
the linoleic acid desaturase gene, has been shown to be both highly active
and seed-specific (Doyle et al., 1986 and Beachy et al., 1985). The β-subunit
of β-conglycinin has been expressed, using its endogenous promoter, in the
seeds of transgenic petunia and tobacco, showing that the promoter
functions in a seed-specific manner in other plants (Bray et al., 1987). The
promoter for β-conglycinin could be used in accordance with the present
invention. If used, this promoter could express the DNA molecule
specifically in seeds, which could lead to an alteration in the linolenic acid
content of the seeds.

In addition, the endogenous plant linoleic acid desaturase
promoters can be used in the present invention. These promoters should be
useful in expressing a linoleic acid desaturase gene in specific tissues,
such as leaves, seeds or fruits. A number of other promoters with
seed-specific or seed-enhanced expression are known and are likely to be
expressed in seeds, which are oil accumulating cells. For illustration, the
napin promoter and the acyl carrier protein promoters have been utilized in
the modification of seed oil by antisense expression (Knutson et al., 1992).

The linolenic acid content of root tissue can be increased by
expressing a linoleic acid desaturase gene behind a promoter which is
expressed in roots. The promoter from the acid chitinase gene (Samac et
al., 1990) is known to function in root tissue and could be used to express
the linoleic acid desaturase in root tissue. Expression in root tissue could
also be accomplished by utilizing the root specific subdomains of the
CaMV35S promoter that have been identified (Benfey et al., 1989). The
linolenic acid content of leaf tissue can be increased by expressing the
linoleic acid desaturase gene using a leaf active promoter such as the
ssRUBISCO promoter or the chlorophyll a/b binding protein gene promoter.

The linolenic acid content of fruits can be increased by
expressing a linoleic acid desaturase gene behind a promoter which is
functional in fruits. Such promoters could be either expressed at all
developmental stages of the fruit or restricted to specific stages,
particularly fruit ripening.

The RNA produced by a DNA construct of the present invention
can also contain a 5' non-translated leader sequence. This sequence can be
derived from the promoter selected to express the gene, and can be
specifically modified so as to increase translation of the mRNA. The 5'
non-translated regions can also be obtained from viral RNAs, from suitable
eukaryotic genes, or from a synthetic gene sequence. The present
invention is not limited to constructs, as presented in the following
examples, wherein the non-translated region is derived from the 5'
non-translated sequence that accompanies the promoter sequence. Rather, the
non-translated leader sequence can be derived from an unrelated promoter
or coding sequence as discussed above.

Linoleic Acid Desaturase Structural Coding Sequences

The structural coding sequence that causes the production of an
RNA sequence that encodes a linoleic acid desaturase activity can be the
sequences disclosed in the present application, or any sequence that can be
obtained using the sequences disclosed in the present application, or any
sequence that can be isolated using the method disclosed in the present
application.

The structural coding sequence can also be a part of or from the
structural coding sequences disclosed in the present invention. It is possible
that the active part of the linoleic acid desaturase is formed using only part
of the structural coding sequences disclosed in the present application.

The structural coding sequences can be obtained from a variety
of sources, such as algae, bacteria or plants. Preferably, structural coding
sequences obtained from plants are used in accordance with the present
invention.

Since virtually nothing was known about the properties of the
linoleic acid desaturase structural coding sequence prior to the present
invention, the method used in the present invention to isolate the structural
coding sequence was based on the concept of map based cloning. The
essential concept in map based cloning is to use information about the
genetic map position of a structural coding sequence to isolate the region of
the chromosome surrounding the structural coding sequence, and then to
use the isolated DNA to complement a mutation in the structural coding
sequence. This strategy has never previously been reported in the isolation
of any plant gene.

In order to implement map based cloning of the linoleic acid
desaturase, mutants of Arabidopsis thaliana (L.) deficient in linoleic acid
desaturase activity were isolated by screening randomly chosen individuals
from mutagenized populations of plants for individual plants with altered
leaf or seed fatty acid composition (Browse et al., 1985; Lemieux et al.,
1990). By screening thousands of plants for altered fatty acid composition,
mutants with decreased amounts of linolenic acid and increased amounts of
linoleic acid in leaf and seed lipids were isolated. Physiological and genetic
analyses of these mutants indicated that they fell into three
complementation groups designated fad3, fadD and fadE.

The fad3 mutants had very reduced levels of linolenic acid in
seeds and roots but had almost normal levels of linolenic acid in leaves.
This effect was interpreted as evidence that the fad3 locus encoded a
microsomal desaturase which was responsible for desaturation of linoleic
acid to linolenic acid on lipids made by the pathway of lipid biosynthesis in
the endoplasmic reticulum, designated the "eukaryotic pathway" (Lemieux
et al., 1990). This pathway is mostly responsible for the synthesis of lipids
in non-green tissues such as seeds and roots, but plays a secondary role in
leaves and other green tissues. Thus, a mutation in the fad3 gene would not
be expected to have a major effect on the desaturation of leaf lipids.

In contrast to the fad3 mutant, the fadD mutant had almost
normal fatty acid composition of roots and seeds, but had a strong
reduction in the amount of linolenic acid in leaf lipids, and a corresponding
increase in the amount of linoleic acid (Browse et al., 1986). Thus, this
mutant had the properties expected of a mutant deficient in a linoleic acid
desaturase from the prokaryotic pathway, which is primarily responsible
for the synthesis of lipids in green tissues.

An unusual property of the fadD mutants was that they were
very deficient in linoleic acid content when grown at temperatures above
about 22 °C but had almost normal fatty acid composition when grown at
temperatures below about 18 °C (McCourt et al., 1987). Since it was very
unlikely that several independently isolated mutations would all give rise to
a temperature conditional phenotype, it was concluded that a second
desaturase must be partially responsible for desaturating linoleic acid to
linolenic acid in green tissues. Therefore, the fadD mutant was
remutagenized with ethylmethane sulfonate, self-fertilized to produce a
segregating population of mutagenized plants (designated the M2
generation), and this population was screened for a mutant which was
deficient in linolenic acid in green tissues at low temperatures. A mutant
with this property was isolated and the mutation responsible for this effect
was designated the fadE locus (Somerville and Browse, unpublished).

Isolation of the Linoleic Acid Desaturase Gene from Canola

The following example was used to isolate the structural coding
sequence from the fad3 region. The method described herein could equally
have been used to isolate either the fadD or fadE region.

In order to approximately locate the fad3 mutation on the genetic
map of Arabidopsis, a sexual cross was made between the fad3 mutant line
BL1 and the multiply marked mutant line W1 (Hugly et al., 1991). The F1
hybrids from this cross were permitted to self-fertilize and the resulting F2
plants were scored for both the segregating genetic markers and the altered
fatty acid composition. The results of this analysis indicated that the fad3
mutation was located on chromosome 2 near the marker erecta. In order
to obtain a more accurate map position by RFLP mapping, a second sexual
cross was made between the fad3 mutant line BL1 and the Niederzenz
race of Arabidopsis. The F1 progeny were permitted to self-fertilize to
produce the F2 generation. 137 F2 plants were grown for 3 weeks at 22 °C
(100 µE/m²/s) in order to produce fully expanded rosettes, and a few
leaves (representing a total weight of 0.2-0.5 g per plant) were harvested
from each plant in order to prepare DNA from them.

The leaves were frozen in liquid nitrogen, and ground in dry ice,
using a mortar and a pestle. For each sample, the frozen powder was
transferred to a microfuge tube and an equal amount of 2X CTAB buffer
(2% cetyltrimethyl ammonium bromide (CTAB), 100 mM Tris-HCl pH 8,
20 mM EDTA, 1.4 M NaCl, 1% polyvinylpolypyrrolidone (PVP) 40,000) was
added. The tubes were left at room temperature for 5 min to allow the
powder to thaw. The homogenate was extracted once with a mixture of
chloroform-isoamyl alcohol (24:1, v/v), and 1/10 vol of 10X CTAB (10%
CTAB, 0.7 M NaCl) buffer was added to the aqueous phase, which was then
reextracted with an equal volume of chloroform-isoamyl alcohol (24:1, v/v).
The aqueous phase was transferred to a fresh microfuge tube and 1.5 vol of
CTAB precipitation buffer (1% CTAB, 50 mM Tris-HCl pH 8, 10 mM
EDTA) was added. The DNA was allowed to precipitate for 12 h at 4 °C,
and collected by centrifugation (5 min at 10,000 g). The DNA was
resuspended in 100 µl of 10 mM Tris-HCl pH 7.5, 1 mM EDTA, 1 M NaCl,
and 100 µg/ml RNase A and incubated at 50 °C for 30 min. The DNA was
precipitated by adding 2.2 vol of ethanol and incubating on ice for 20 min.
The DNA was collected by centrifugation and the pellet was washed once
with 1 ml of 70% ethanol, dried under vacuum for 3 min and resuspended in
10 µl of distilled water. The DNA was stored at -20 °C until use.
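
The salt and detergent concentrations at each stage of the CTAB
procedure above follow from simple volume-weighted mixing. The sketch
below is illustrative only. It assumes that the thawed leaf powder
contributes a volume equal to the 2X buffer (so the aqueous phase starts
at roughly 1% CTAB and 0.7 M NaCl) and that the chloroform extractions do
not change the composition of the aqueous phase. The function name mix
and the variable names are hypothetical.

```python
def mix(components):
    """Volume-weighted final concentration.

    components: list of (concentration, volume) pairs in consistent units.
    """
    total_volume = sum(volume for _, volume in components)
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return sum(conc * volume for conc, volume in components) / total_volume


# Assumption: 2X CTAB buffer (2% CTAB, 1.4 M NaCl) diluted 1:1 by tissue.
ctab_start = mix([(2.0, 1.0), (0.0, 1.0)])   # % CTAB
nacl_start = mix([(1.4, 1.0), (0.0, 1.0)])   # M NaCl

# Add 1/10 vol of 10X CTAB (10% CTAB, 0.7 M NaCl) to the aqueous phase.
ctab_after_10x = mix([(ctab_start, 1.0), (10.0, 0.1)])
nacl_after_10x = mix([(nacl_start, 1.0), (0.7, 0.1)])

# Add 1.5 vol of CTAB precipitation buffer (1% CTAB, no NaCl).
ctab_final = mix([(ctab_after_10x, 1.0), (1.0, 1.5)])
nacl_final = mix([(nacl_after_10x, 1.0), (0.0, 1.5)])

print(f"after 10X CTAB: {ctab_after_10x:.2f}% CTAB, {nacl_after_10x:.2f} M NaCl")
print(f"after precipitation buffer: {ctab_final:.2f}% CTAB, {nacl_final:.2f} M NaCl")
```

Under these assumptions the 10X addition raises the CTAB concentration
without changing the NaCl concentration, and the precipitation buffer
then lowers the NaCl concentration while keeping CTAB above 1%.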

The 137 plants were grown to maturity and their seeds were
collected individually. The fatty acid composition of 10 individual seeds
from each of the F2 plants was measured as described by Browse et al.
(1986) in order to score the fad3 phenotype of each plant. Each seed was
incubated in 1 ml of 1 N HCl in methanol for 1 h at 80 °C. The tubes were
cooled to room temperature and 1 ml of 0.9% NaCl plus 0.3 ml of hexane
were added. The tubes were agitated by vortexing and the phases separated
by centrifugation (500 x g for 5 min). The hexane phase was saved,
evaporated under a stream of nitrogen, and the fatty acid methyl esters
were dissolved in 50 µl hexane. An aliquot (2 µl) was injected onto the gas
chromatograph and the fatty acid methyl esters separated and quantitated
by flame ionization as described (Browse et al., 1986).

The DNA samples (1 µg) were then cut with the appropriate
restriction enzyme (EcoRI for the marker 220, Bgl2 for the marker
ASA2) using a concentration of 1X KGB buffer (Sambrook et al., 1989), 5
units of the restriction endonuclease and 100 µg/ml BSA. The volume of
each sample was 10 µl and the incubation was performed at 37 °C for 4 h.
The fragments were resolved by agarose gel electrophoresis (0.8% agarose
in 1X TAE buffer; Sambrook et al., 1989) and transferred to nylon filters
(Hybond) using the alkaline transfer method as described by the
manufacturer. The nylon filters were probed (according to Church and
Gilbert, 1984) with radioactively labelled fragments of DNA (Sambrook et
al., 1989) corresponding to known RFLP markers which had previously
been mapped in the approximate vicinity of the fad3 locus on chromosome
2. The RFLP markers 220 (Chang et al., 1988) and ASA2 were found to
map close to the fad3 locus. Analysis of the pattern of recombinants
(Table 1) indicated that both ASA2 and 220 were located on the same side
of the fad3 locus at distances of 0.4 and 2.2 centimorgans (cM),
respectively.

Table 1

# of plants    220    ASA2    fad3
     67         H      H      +/-
     30         L      L      -/-
     34         N      N      +/+
      3         H      N      +/+
      1         L      H      +/-
      1         N      H      +/-
      1         H      H      -/-

Table 1 shows the genotype of the F2 plants used for mapping
the fad3 locus. L is for Landsberg (background of the fad3 mutant), N is
for Niederzenz, H for heterozygous. A total of 137 F2 plants were analyzed.
The number of recombinant plants between fad3 and 220 or ASA2 was 6
and 1, respectively.
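
The map distances quoted above can be recovered from the recombinant
counts. The following is a minimal sketch, assuming that each recombinant
F2 plant carries a single recombinant chromosome and that, over such
short intervals, the recombination frequency in percent approximates the
distance in cM. The function and variable names are illustrative.

```python
def map_distance_cm(recombinant_gametes: int, f2_plants: int) -> float:
    """Recombination frequency in percent (approximately cM) in an F2.

    Each F2 plant derives from two gametes, so the number of meiotic
    products scored is twice the number of plants.
    """
    if f2_plants <= 0:
        raise ValueError("f2_plants must be positive")
    return 100.0 * recombinant_gametes / (2 * f2_plants)


F2_PLANTS = 137                      # plants analyzed (Table 1)
RECOMBINANTS = {"220": 6, "ASA2": 1}  # recombinants with fad3 (Table 1)

distances = {
    marker: map_distance_cm(count, F2_PLANTS)
    for marker, count in RECOMBINANTS.items()
}

for marker, cm in distances.items():
    print(f"fad3 to {marker}: {cm:.2f} cM")
```

Rounded to one decimal place, these values agree with the 2.2 cM and
0.4 cM reported for markers 220 and ASA2.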

In order to isolate the region of the chromosome containing the
fad3 locus, the RFLP markers 220 and ASA2 were used as hybridization
probes to screen several yeast artificial chromosome (YAC) libraries (Grill
and Somerville, 1991; Ward and Jen, 1990). The YAC filters were prepared
according to Grill and Somerville (1991). The library was replicated onto
nylon filters disposed on petri dishes of SC- (synthetic complete medium
minus tryptophan and uracil; Sherman et al., 1986). The cells were allowed
to grow for 12 h at 30 °C, and the filters were transferred for 15 min onto a
Whatman 3MM paper saturated with 1 M sorbitol, 50 mM DTT, 50 mM
EDTA (pH 8).

The cell wall of the cells was then digested with lyticase, by
incubating the filters on a Whatman paper saturated with 1 M sorbitol, 50
mM EDTA and 2 mg/ml lyticase (Sigma Co., St. Louis, MO) for 12 h at
30 °C. The filters were then transferred onto a Whatman 3MM paper
saturated with 0.5 M NaOH, 1.5 M NaCl for 15 min, neutralized with 0.5 M
Tris-HCl pH 8 for 15 min and quickly rinsed in 2X SSC (SSC is 10 mM
sodium citrate, 150 mM NaCl, pH 7). The filters were allowed to dry, and
were transferred to a vacuum oven at 80 °C for 1 h. They were
subsequently hybridized according to Church and Gilbert (1984), with
probes labelled with 32P according to Sambrook et al. (1989).

The DNA of RFLP probe 220 was prepared from 100 ml of liquid
culture lysate using the lambdasorb procedure (Promega Corp., Madison,
WI); the cDNA encoding ASA2 was excised from the original plasmid
(pKN140C; obtained from Dr. G. Fink, Whitehead Institute, Cambridge,
MA) with Hind3 and cloned into the Hind3 site of pBLUESCRIPT. The
plasmid DNA was then purified by cesium chloride gradients according to
Sambrook et al. (1989), digested with Hind3 and the DNA insert was gel
purified twice by electroelution according to Sambrook et al. (1989).

In order to probe the libraries, the whole DNA from RFLP220
was used as a hybridization probe. By contrast, only the DNA insert of
ASA2 was used as a probe. The RFLP probe 220 hybridized to YACs
EG4E8 and EG9D12. The probe ASA2 hybridized to YACs EW15G1,
EW15B4 and EW7D11.

In order to determine if these YACs contained all of the DNA
between RFLP220 and ASA2, small regions of DNA from the ends of the
inserts in EG4E8 and EW15G1 were prepared by inverse PCR (Grill and
Somerville, 1991). For that purpose, DNA was prepared from the
appropriate YAC clones. The clones (single colonies) were grown to
saturation in SC- liquid cultures, and 1 ml of these cultures was used to
inoculate 40 ml liquid cultures (in SC- medium) that were allowed to grow
for 16 h at 30 °C. The cells were collected by centrifugation, washed once in
1 M sorbitol, 50 mM EDTA, resuspended in 200 µl of 1 M sorbitol, 50 mM
EDTA, 100 mM sodium citrate pH 5.8, 2 mM β-mercaptoethanol and 2
mg/ml lyticase, and incubated 2 h at 30 °C.

Next, 350 µl of 2X CTAB buffer was added and the DNA was
purified as described above. DNA (5 µg) of each clone was digested
separately with HincII, AluI, EcoRV and RsaI (in 1X KGB buffer, at 37 °C
for 4 h; final volume: 50 µl). The reactions were stopped by heating at 65
°C for 15 min, extracted once with one volume of phenol saturated with TE
pH 8, followed by an extraction with 1 volume of chloroform-isoamyl
alcohol mixture (24:1, vol/vol). The DNA was recovered by ethanol
precipitation and resuspended in sterile distilled water. The ligation
reactions were performed using 300 ng of DNA in a final volume of 50 µl.
The reactions were carried out in 50 mM Tris-HCl pH 7.4, 10 mM MgCl2, 1
mM DTT, 1.2 mM ATP with 1 U of ligase, for 2 h at 20 °C, and stopped by
heating at 68 °C for 30 min.

The PCR reactions were carried out as follows: The buffers used
were the ones indicated by the suppliers except for the Perkin Elmer
enzyme, for which the reaction was supplemented with an additional 1.4
mM MgCl2 (final concentration 2.9 mM Mg). The dNTP final concentration



was 125 µM when the Perkin Elmer enzyme was used and 200 µM with the
Taq polymerases from other sources. In all cases, 100 ng of each
oligonucleotide was used. The final volume was 100 µl. When no product
was obtained, the reactions were carried out again in the same conditions
except that formamide was added to a final concentration of 3%.

The left end was amplified from the ligation products of the
EcoRV and RsaI digests, using the oligonucleotides EG1
(GGCGATGCTGTCGGAATGGACGATA) (SEQ. ID NO. 3) and EG2
(CTTGGAGCCACTATCGACTACGCGATC) (SEQ. ID NO. 4).

The right end of the clones obtained from the EG library was
amplified from the ligation products of the AluI and HincII digests, using
the oligonucleotides EG3 (CCGATCTCAAGATTACGGAAT) (SEQ. ID NO.
5) and EG4 (TTCCTAATGCAGGAGTCGCATAAG) (SEQ. ID NO. 6).

The right end of the clones obtained from the EW YAC library
was amplified using the oligonucleotides H1 (AGGAGTCGCATAAGGGAG)
(SEQ. ID NO. 7) and H2 (GGGAAGTGAATGGAGAC) (SEQ. ID NO. 8),
using the same cycle conditions as above, except that the annealing
temperature was reduced to 50 °C.
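
As a rough illustration of why a lower annealing temperature may suit the
shorter H1 and H2 oligonucleotides, the sketch below computes the length,
GC fraction and a Wallace-rule melting temperature estimate,
Tm = 2(A+T) + 4(G+C), for the six primers listed above. The Wallace rule
is only a crude approximation intended for short oligonucleotides and is
an assumption of this example, not the method used to choose the
annealing temperatures in the present work. The function names are
illustrative.

```python
PRIMERS = {
    "EG1": "GGCGATGCTGTCGGAATGGACGATA",    # SEQ. ID NO. 3
    "EG2": "CTTGGAGCCACTATCGACTACGCGATC",  # SEQ. ID NO. 4
    "EG3": "CCGATCTCAAGATTACGGAAT",        # SEQ. ID NO. 5
    "EG4": "TTCCTAATGCAGGAGTCGCATAAG",     # SEQ. ID NO. 6
    "H1": "AGGAGTCGCATAAGGGAG",            # SEQ. ID NO. 7
    "H2": "GGGAAGTGAATGGAGAC",             # SEQ. ID NO. 8
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Wallace-rule estimate in degrees C: 2 per A/T plus 4 per G/C."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, "
          f"Wallace Tm about {wallace_tm(seq)} C")
```

The estimate overstates the melting temperature of the longer EG
oligonucleotides, so it is best read as a relative comparison between
primers rather than as an absolute value.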

After the reactions were completed, 5 µl of each mixture were
electrophoresed on an agarose gel to separate the amplification product
from primers. The slice of agarose that contained the amplified band was
excised from the gel and melted in 1 ml of distilled water. Large amounts of
product could then be produced by reamplification of 5 µl of the melted
slice. The PCR products were then purified by electroelution or by using
GeneClean (Bio101) and used as hybridization probes to probe filters
containing the isolated YAC DNA restricted by several enzymes. The
probe made from the right end of EW15G1 hybridized to EG4E8 and
similarly, a probe from the right end of EG4E8 hybridized to EW15G1.
Thus, it was concluded that the YACs EG4E8 and EW15G1 contained all of
the DNA in the region of the chromosome between RFLP220 and ASA2.

The size of the YAC clones was estimated by field inversion
electrophoresis (CHEF, Vollrath and Davis, 1987). High molecular weight
DNA was prepared as follows: the yeast cells which contained the YAC
clones were grown and treated with lyticase as for preparing DNA as
described above. The spheroplasts were then resuspended in an equal
volume of 1 M sorbitol, 50 mM EDTA, 1% low melt agarose at 37 °C. The
mixture was poured in a mould (Biorad) which was set on ice to allow the
agarose to harden.

The resulting plugs were incubated for 12 h in 0.5 M EDTA pH 9,
1% lauryl sarcosine, 1 mg/ml Proteinase K at 50 °C. The plugs were
subsequently washed twice in 50 mM EDTA and stored at 4 °C until use.
The CHEF gel was run in 1X TBE for 16 h at 200 V, with a switching
interval of 20 s; the temperature of the buffer was maintained at 14 °C
during the run. The sizes of the YACs were determined by comparison with
a lambda ladder and the yeast chromosomes, and were as follows: EG4E8,
90 kb; EG9D12, 190 kb; EW15G1, 90 kb; EW15B4, 70 kb; EW7D11, 125
kb. These sizes permitted us to roughly determine a correspondence
between physical and genetic distances: the distance that separates 220
from ASA2 cannot exceed 180 kb, the sum of the sizes of the 2 YACs
EG4E8 and EW15G1. Since the corresponding genetic distance is 1.7 cM,
one can roughly estimate that, in this particular cross and in this particular
region of the genome, the value of 1 cM is close to 100 kb. Thus, since the
fad3 gene maps only 0.4 cM away from ASA2, the corresponding physical
distance should be close to 40 kb. We then concluded that fad3 was
probably located on the YAC EW7D11, which is the largest YAC
hybridizing with ASA2. See Figure 1.
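
The physical-to-genetic conversion above amounts to a short calculation,
sketched here for illustration. The YAC sizes and genetic distances are
those given in the text; the variable names are illustrative. Because
180 kb is an upper bound on the 220 to ASA2 interval, the resulting
figures are upper estimates, in line with the rounded values of about
100 kb per cM and about 40 kb quoted above.

```python
# YAC insert sizes in kb, as determined by CHEF electrophoresis.
YAC_SIZES_KB = {
    "EG4E8": 90,
    "EG9D12": 190,
    "EW15G1": 90,
    "EW15B4": 70,
    "EW7D11": 125,
}

# The two overlapping YACs that span the interval between 220 and ASA2.
max_interval_kb = YAC_SIZES_KB["EG4E8"] + YAC_SIZES_KB["EW15G1"]

GENETIC_INTERVAL_CM = 1.7   # genetic distance between 220 and ASA2
FAD3_TO_ASA2_CM = 0.4       # genetic distance between fad3 and ASA2

kb_per_cm = max_interval_kb / GENETIC_INTERVAL_CM
fad3_to_asa2_kb = FAD3_TO_ASA2_CM * kb_per_cm

# Among the YACs that hybridized to ASA2, pick the largest insert.
ASA2_YACS = ["EW15G1", "EW15B4", "EW7D11"]
best_candidate = max(ASA2_YACS, key=YAC_SIZES_KB.get)

print(f"upper bound on interval: {max_interval_kb} kb")
print(f"about {kb_per_cm:.0f} kb per cM")
print(f"fad3 is at most about {fad3_to_asa2_kb:.0f} kb from ASA2")
print(f"largest ASA2-positive YAC: {best_candidate}")
```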
In order to test the possibility that the YAC EW7D11 carried
the fad3 gene, the YAC was used to probe a cDNA library made from
developing seeds of Canola (Brassica napus L.). Even though the YAC was
isolated from Arabidopsis, the fact that Arabidopsis and B. napus are both
members of the family Cruciferae led us to predict that the homologous
genes from these two species would be sufficiently identical at the
nucleotide sequence level so that the Arabidopsis gene would hybridize to
the B. napus gene. We also assumed that, because it catalyzes a
chemically similar reaction to the stearoyl-ACP desaturase, it would be
expressed at similar moderately high levels in developing seeds (Shanklin
and Somerville, 1991). Since EW7D11 contained only about 0.2% of the
total genome, we expected it to contain only about 2 moderately
abundantly expressed genes (i.e., genes in which the mRNA is between 0.1
and 0.01% of total mRNA).

DNA of YAC EW7D11 was isolated as follows: high molecular
weight DNA was prepared from the yeast cells that contained the YAC
EW7D11 as described above, and several preparative low-melt agarose
CHEF gels were run in 1X TBC buffer (same as TBE except that CDTA
was substituted for EDTA). The slices that contained the YAC were excised
from the gels and pooled. Three slices were melted at 65 °C and extracted
with an equal volume of phenol saturated with TE. The aqueous phase was
saved and reduced to 0.5 ml by repeated extractions with isobutyl alcohol.
The remaining agarose was removed by several phenol extractions, followed
by two chloroform-isoamyl alcohol extractions. The DNA was precipitated
by adding 2 µg of linear acrylamide as a carrier plus 10 µl of 5 M NaCl and
1.1 ml of ethanol, and incubating 20 min at 0 °C. The DNA pellet was
recovered by centrifugation, washed in 70% ethanol, dried under vacuum
and resuspended in 50 µl of distilled water. The DNA (50 ng) was
radioactively labelled and used to probe a cDNA library in λgt11.

The nitrocellulose filters were processed as described in
Sambrook et al. (1989). Duplicate filters were used, and the films were
exposed 5-7 days in order to obtain a good signal. From among 200,000
plaques screened in this way, 31 hybridized to EW7D11. Among these 31
clones, 17 were homologous to each other, as checked by cross
hybridization in stringent conditions. The size of the inserts in the 17 clones
was estimated and the clone with the largest cDNA was retained for
further analysis. A small scale preparation of this phage was prepared
using the lambdasorb method, and the insert was excised by restricting
with EcoRI. This insert was ligated into a pBLUESCRIPT II vector
linearized with EcoRI, and the ligation mixture was used to transform E.
coli strain DH5α.

One of the recombinant clones was designated pBNDES3
(Figure 2), and retained for sequencing. The sequence was determined on
both strands, using the Sequenase enzyme (US Biochemicals, Cleveland,
OH) according to the instructions provided by the supplier. The nucleotide
sequence of the insert in pBNDES3 is presented as Figure 3. The deduced
amino acid sequence of the largest open reading frame in the nucleotide
sequence is also shown in Figure 3.

Comparison of the deduced amino acid sequence of the 383
amino acid open reading frame in clone pBNDES3 against the known
sequences in GenBank release 70 was performed using the FASTA
program (Lipman and Pearson, 1985). This analysis revealed that the
sequence from pBNDES3 had a region of significant homology to a
previously characterized desaturase gene from the cyanobacterium
Synechocystis (Figure 4) (Wada et al., 1990). This was considered
suggestive evidence that the clone pBNDES3 encoded a desaturase which
was probably the fad3 structural coding sequence product. This was
subsequently confirmed by a genetic complementation experiment.

The cDNA was cloned into plant transformation vector pBI121
(Figure 5) under the control of the CaMV35S promoter to construct
pTiDES3 (Figure 6). Plasmid pTiDES3 was introduced into an
Agrobacterium tumefaciens strain which also carried an Ri plasmid and this
was used to produce transgenic rooty tumors from both wild type
Arabidopsis and the fad3 mutant. Transgenic tissue was selected for
antibiotic resistance to confirm the presence of pTiDES3. Fatty acid
methyl esters were then prepared and examined by gas chromatography to
determine the profile of fatty acids being produced in the tissue. The levels
of linolenic acid increased, demonstrating that the cDNA on pTiDES3 can
complement the fad3 mutation. These results, which are described in detail
in Example 1 below, confirm the identity of the cDNA as encoding a linoleic
acid desaturase.

The isolation of a plant structural coding sequence provides
those skilled in the art with a tool for the manipulation of gene expression
by the mechanism of antisense RNA. The technique of antisense RNA is
based upon introduction of a chimeric gene which will produce an RNA
transcript that is complementary to a target gene (reviewed in Bird and
Ray, 1991). The resulting phenotype is a reduction in the gene product
from the endogenous gene. The portion of the gene which is sufficient for
achieving the antisense effect is variable in that numerous fragments or
combinations thereof are likely to be effective. Various portions of the
structural coding sequence of linoleic acid desaturase isolated either from
cDNA or genomic clones are likely capable of reducing linolenic acid levels in
plants by reducing linoleic acid desaturase levels. An example
of using an antisense oriented linoleic acid desaturase structural coding
sequence is set out in Example 2.
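
The complementarity on which the antisense approach relies can be shown
with a short sketch. The fragment used below is a hypothetical
placeholder and is not taken from the sequence of Figure 3; the function
names are likewise illustrative. Cloning a fragment in the antisense
orientation places its reverse complement on the coding strand, so the
transcript is complementary to the target mRNA.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence given 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def antisense_rna(sense_fragment: str) -> str:
    """RNA transcribed when the fragment is inserted in antisense
    orientation behind a promoter (5' to 3', with U in place of T)."""
    return reverse_complement(sense_fragment).replace("T", "U")


# Hypothetical sense-strand fragment, for illustration only.
sense_fragment = "ATGGTTGTT"

print("sense DNA:     ", sense_fragment)
print("antisense DNA: ", reverse_complement(sense_fragment))
print("antisense RNA: ", antisense_rna(sense_fragment))
```

Applying reverse_complement twice returns the original fragment, which is
a convenient check that an insert has been oriented as intended.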

Polyadenylation Signal

The 3' non-translated region of the double stranded DNA
molecule of the present invention contains a region that functions in plant
cells to promote polyadenylation to the 3' end of the RNA sequence. Any
such regions can be used within the scope of the present invention.
Examples of suitable 3' regions are (1) the 3' transcribed, non-translated
regions containing the polyadenylated signal of Agrobacterium
tumor-inducing (Ti) plasmid genes, such as the nopaline synthase (NOS)
gene, and (2) 3' regions of plant genes like the soybean storage protein
genes and the small subunit of the ribulose-1,5-bisphosphate carboxylase
(ssRUBISCO) gene. An example of a preferred 3' region is that from the NOS
gene, described in greater detail in the examples below.

Plant Transformation/Regeneration

Any plant which can be transformed to contain the
double-stranded DNA molecule of the present invention is included within
the scope of this invention. Preferred plants which can be made to have
increased or decreased linolenic acid content by practice of the present
invention include, but are not limited to, sunflower, safflower, cotton, corn,
wheat, rice, peanut, canola/oilseed rape, barley, sorghum, soybean, flax,
tomato, almond, cashew and walnut.

A double-stranded DNA molecule of the present invention
containing the functional plant linoleic acid desaturase gene can be inserted
into the genome of a plant by any suitable method. Suitable plant
transformation vectors include those derived from a Ti plasmid of
Agrobacterium tumefaciens, as well as those disclosed, e.g., by
Herrera-Estrella (1983), Bevan (1984), Klee (1985) and EPO publication
120,516 (Schilperoort et al.). In addition to plant transformation vectors
derived from the Ti or root-inducing (Ri) plasmids of Agrobacterium,
alternative methods can be used to insert the DNA constructs of this
invention into plant cells. Such methods can involve, for example, the use of
liposomes, electroporation, chemicals that increase free DNA uptake, free
DNA delivery via microprojectile bombardment, and transformation using
bacteria, viruses or pollen.

A plasmid expression vector suitable for the expression of the
linoleic acid desaturase gene in monocots is composed of the following: a
promoter that is specific or enhanced for expression in the lipid storage
tissues and a 3' polyadenylation sequence such as the nopaline synthase 3'
sequence (NOS 3'; Fraley et al., 1983). This expression cassette may be
assembled on high copy replicons suitable for the production of large
quantities of DNA.

A particularly useful Agrobacterium-based plant transformation
vector for use in transformation of dicotyledonous plants is plasmid vector
pMON530 (Rogers, S.G., 1987). Plasmid pMON530 (see Figure 7) is a
derivative of pMON505 prepared by transferring the 2.3 kb StuI-HindIII
fragment of pMON316 (Rogers, S.G., 1987) into pMON526. Plasmid
pMON526 is a simple derivative of pMON505 in which the SmaI site is
removed by digestion with XmaI, treatment with Klenow polymerase and
ligation. Plasmid pMON530 retains all the properties of pMON505 and the
CaMV35S-NOS expression cassette and now contains a unique cleavage
site for SmaI between the promoter and polyadenylation signal.

Vector pMON505 is a derivative of pMON200 (Rogers, S.G.,
1987) in which the Ti plasmid homology region, LIH, has been replaced with
a 3.8 kb HindIII to SmaI segment of the mini RK2 plasmid, pTJS75
(Schmidhauser & Helinski, 1985). This segment contains the RK2 origin of
replication, oriV, and the origin of transfer, oriT, for conjugation into
Agrobacterium using the tri-parental mating procedure (Horsch & Klee,
1986). Plasmid pMON505 retains all the important features of pMON200
including the synthetic multi-linker for insertion of desired DNA fragments,
the chimeric NOS/NPTII'/NOS gene for kanamycin resistance in plant
cells, the spectinomycin/streptomycin resistance determinant for selection
in E. coli and A. tumefaciens, an intact nopaline synthase gene for facile
scoring of transformants and inheritance in progeny, and a pBR322 origin of
replication for ease in making large amounts of the vector in E. coli.
Plasmid pMON505 contains a single T-DNA border derived from the right
end of the pTiT37 nopaline-type T-DNA. Southern analyses have shown
that plasmid pMON505 and any DNA that it carries are integrated into the
plant genome, that is, the entire plasmid is the T-DNA that is inserted into
the plant genome. One end of the integrated DNA is located between the
right border sequence and the nopaline synthase gene and the other end is
between the border sequence and the pBR322 sequences.

When adequate numbers of cells (or protoplasts) containing the
linoleic acid desaturase gene are obtained, the cells (or protoplasts) are
regenerated into whole plants. Choice of methodology for the regeneration
step is not critical, with suitable protocols being available for hosts from
Leguminosae (alfalfa, soybean, clover, etc.), Umbelliferae (carrot, celery,
parsnip), Cruciferae (cabbage, radish, rapeseed, etc.), Cucurbitaceae
(melons and cucumber), Gramineae (wheat, rice, corn, etc.), Solanaceae
(potato, tobacco, tomato, peppers) and various floral crops. See, e.g.,
Ammirato, 1984; Shimamoto, 1989; Fromm, 1990; Vasil and Vasil, 1990.
Uses of Linoleic Acid Desaturase 

The present invention can be used for any modification (either
increase, decrease, or mere change) of the oil content of a plant or plant
tissue. Linolenic acid is an important constituent of several membranes in
plant cells.

One preferred method is to modify the oil content of the plant to
improve the plant's temperature sensitivity. For instance, plants deficient
in linolenic acid display reduced fitness at low temperature (Hugly and
Somerville, 1992). Also, increased linoleic acid content in vegetative tissues
has been implicated as a factor in freezing tolerance in higher plants
(Steponkus et al., 1990 and references therein). In a preferred
embodiment, expression of the linoleic acid desaturase structural coding
sequence can result in the genetic modification of higher plants to achieve
tolerance to low environmental temperatures. Transformation with
pTiDES3 demonstrates that linolenic acid levels can be increased by
expression of this gene in a constitutive manner. Chilling or freezing injury
in crops may be overcome by expression of this gene in vegetative or
reproductive tissues by employing an appropriate promoter.

Linolenic acid, a polyunsaturated fatty acid, is also extensively
used in the paint and varnish industry in view of its rapid oxidation. Flax
seed is a predominant source of this oil. Higher quantities of this fatty acid
in rapeseed or soybean will provide opportunities for using vegetable oils
from these sources as a replacement for linseed (flax) oil. Expression of a
linoleic acid desaturase structural coding sequence in seed tissue can result
in a higher proportion of linolenic acid in the storage oil.

Linolenic acid is further a precursor in the biosynthesis of
jasmonic acid, an important plant growth regulator. Linolenic acid is
converted to jasmonic acid by introduction of an oxygen to the carbon chain
by a lipoxygenase, followed by dehydration, reduction, and several
β-oxidations (Vick and Zimmerman, 1984). The activity of jasmonic acid has
been measured in terms of induction of pathogen defense responses. By
application of free linolenic acid to plants, plant pathogen defenses can also
be induced (Farmer and Ryan, 1992). A model has been proposed to explain
the ability of free linolenic acid to exhibit the effects associated with
jasmonic acid (Farmer and Ryan, 1992). It is hypothesized that all of the
enzymatic activities which are required for the conversion of linolenic acid
to jasmonic acid are constitutively present in the cell and the rate limiting
step in the production of jasmonic acid is the availability of free linolenic
acid. A likely route for the production of the free linolenic acid is by the
activity of a lipase in the plasma membrane.

It further has been observed that exogenous jasmonic acid can
more powerfully activate defense responses than can wounding. This
suggests that wounds cannot generate enough free linolenic acid to support
high level production of jasmonic acid. The activity of the lipase or the
availability of appropriate substrate for the lipase may be rate limiting
upon wounding. By increasing levels of available substrate, that is, by
increasing linolenic acid levels in the plasma membrane, it should be
possible to enhance a plant's ability to respond to pathogens by allowing for
a higher production of jasmonic acid. Expression of a linoleic acid desaturase
structural coding sequence can result in a higher molar percent linolenic
acid in the plasma membrane of a plant cell, therefore enhancing the
jasmonic acid signaling pathway. It is our intent to evaluate plants
containing high linolenic acid levels in root and foliar tissues for their
pathogen resistance.

It is also undesirable to have significant levels of linolenic acid in
cooking oils. Linolenic acid is unstable during cooking and is rapidly oxidized.
The oxidized products impart rancidity to the finished product. Rapeseed or
soybean oil containing less than about 3%, and preferably 2% or less of
linolenic acid is ideal for use as a cooking oil. By expression of the antisense
of the structural coding sequence for linoleic acid desaturase, it is possible
to reduce the linolenic acid content of these oils.

All higher plants have linolenic acid and, therefore, contain genes
for linoleic acid desaturases. Because of the many examples in which genes
isolated from one plant species have been used to isolate the homologous
genes from other plant species, it is apparent to anyone skilled in the art
that the results presented here do not only pertain to the use of the B.
napus fad3 gene, or to the use of the gene to modify fatty acid composition
in B. napus. Obviously, the linoleic acid desaturases from many organisms
could be used to increase linolenic acid biosynthesis and accumulation in
plants and enzymes from any other higher plant or algae can serve as
sources for linoleic acid desaturase genes. For example, since a YAC
containing the Arabidopsis gene was used to isolate the B. napus gene, it is
apparent that the insert in pBNDES3 could be used as a probe of genomic
libraries for isolation of the corresponding full length genes from other plant
species. It is also likely that the information contained in the sequence of
this gene will be useful to clone other lipid desaturase genes.

Expression of a linoleic acid desaturase in a sense orientation
may also allow for the isolation of plants with reduced levels of linolenic
acid. This could be accomplished by the mechanism of co-suppression (Bird
and Ray, 1991). The molecular mechanism of co-suppression is at this
time poorly understood but occurs when plants are transformed with a gene
that is identical or highly homologous to an allele found in the plant's
genome. There are several examples where expression of a chimeric gene in
plants can result in a reduction of the gene product from both the chimeric
gene and the endogenous gene(s). Those skilled in the art will recognize that
the resulting decrease in linolenic acid would be a direct result of expression
of the linoleic acid desaturase structural coding sequence and would be
correlated to the linoleic acid desaturase activity in the transformed plant.

Linolenic acid levels in plant cells can also be modified by
isolating genes encoding transcription factors which interact with the
upstream regulatory elements of the plant linoleic acid desaturase gene(s).
Enhanced expression of these transcription factors in plant cells can affect
the expression of the linoleic acid desaturase gene. Under these conditions,
the increased or decreased linolenic acid content would also be caused by a
corresponding increase or decrease in the activity of the linoleic acid
desaturase enzyme, although the mechanism is different. Methods for the
isolation of transcription factors have been described (Katagiri, 1989).

The following examples are provided to better elucidate the
practice of the present invention and should not be interpreted in any way
to limit the scope of the present invention. Those skilled in the art will
recognize that various modifications, truncations, etc. can be made to the
methods and genes described herein while not departing from the spirit and
scope of the present invention.
Example 1

Expression of fad3 gene to increase linolenic acid

To verify the assumption that the cDNA insert in pBNDES3
encodes a linoleic acid desaturase, both wild type and fad3 mutant
Arabidopsis were transformed to contain the cDNA insert. In order to
express the linoleic acid desaturase structural coding sequence (hereafter
referred to as the "fad3 gene") in plant cells, the plasmid pBNDES3 was
digested with XhoI and the ends were filled in with the Klenow fragment of
DNA polymerase (Sambrook et al., 1989). The cDNA insert was
subsequently excised by digestion with SacI and ligated into the SacI and
SmaI sites of the binary Ti plasmid vector pBI121 (Clontech
Laboratories), thereby replacing the GUS reading frame. The ligation
reaction was carried out in 20 µl for 12 h at 16 °C using 100 ng of both
insert and vector, and one unit of T4 DNA ligase. The ligation mixture was
used to transform competent DH5α E. coli cells (prepared by the calcium
chloride method, according to Sambrook et al., 1989), and transformants
were selected on L-broth plates that contained 50 µg/ml kanamycin.
Alkaline minipreparations of recombinant clones were analyzed for the
correct restriction pattern. One of these plasmids, designated pTiDES3,
was used for further experiments.
This plasmid was electroporated (according to Mersereau and
Pazour, 1990) into Agrobacterium tumefaciens strain R1000, which carries
an Ri plasmid. The transformed bacteria were selected on kanamycin LB
plates for 2 days at 30 °C. DNA minipreparations of several recombinant
bacteria were performed and analyzed as described above to verify the
presence of the construct.

Young flowering stems of wild type and the fad3 mutant of
Arabidopsis were sterilized for 30 min in 10% commercial bleach, 0.02%
Triton X-100, and 2-cm explants that contained the flowering stem were
infected with R1000 (pTiDES3). This was performed by dipping the
sectioned extremity in a drop of an overnight culture of the appropriate
Agrobacterium that was grown from a single colony in LB medium
supplemented with 50 µg/ml kanamycin.

The infected stems were cultured for two days on solid MSO
medium (Gibco MS salts plus Gamborg B5 vitamins, 3% sucrose and 0.8%
agar). At this time the stem segments were transferred for 5 weeks to
MSO medium containing 200 µg/ml cefotaxime to kill the bacterium. After
approximately two weeks, most of the stem explants had developed rooty
tumors resulting from transfer of parts of the Ri plasmid into cells of the
stem explants. In order to identify the rooty tumors which had also
received the binary Ti plasmid pTiDES3, approximately 24 rooty tumors
from each treatment were transferred to MSO medium containing 50 µg/ml
of kanamycin to select for the growth of those roots which had been
cotransformed with the binary Ti plasmid; the medium also contained 200
µg/ml of cefotaxime to inhibit bacterial growth. Following a further period of
growth for 2 weeks, fatty acid methyl esters were prepared (as described
above) from the roots for analysis by gas chromatography. The results of
these analyses are presented in Table 2.

Table 2

Genotype      wild type     fad3          wild type     fad3
Plasmid       pBI121        pBI121        pTiDES3       pTiDES3

Fatty acid (mol%)
16:0          22.0±2.9      21.2±1.6      21.1±0.9      21.3±2.3
16:1           2.5±0.7       1.6±0.8       2.0±0.1       1.5±0.2
18:0           2.3±1.9       2.3±1.9       1.9±0.2       1.6±0.4
18:1           3.8±1.3       5.9±2.6       7.7±2.0       9.1±2.0
18:2          37.3±3.7      62.2±5.9      15.7±11.7     24.4±14.9
18:3          31.9±4.5       6.7±0.7      51.3±10.9     42.1±15.5

Table 2 shows the fatty acid composition of transgenic roots.
The transgenic roots resulting from infection of wild type or the fad3
mutant with A. tumefaciens R1000 carrying the vector (pBI121) or the
plasmid pTiDES3 were grown in the presence of kanamycin (50 µg/ml) for
three weeks to identify the roots which had been cotransformed with one of
these plasmids. The fatty acid composition of the roots was determined as
previously described (Browse et al., 1986). The abbreviations used in Table
2 are as follows: 16:0, palmitic acid; 16:1, palmitoleic acid; 18:0, stearic acid;
18:1, oleic acid; 18:2, linoleic acid; 18:3, linolenic acid. The values presented
are the mean ± SD (n=12).

From these results it can be seen that the production of rooty
tumors containing pBI121 on wild type Arabidopsis or the fad3 mutant had
no effect on the fatty acid composition over non-pBI121 containing wild
type Arabidopsis or fad3 mutant. By contrast, transformation of the fad3
mutant with the plasmid pTiDES3 resulted in large increases in the
content of linolenic acid. In contrast to the linolenic acid content of 6.7 ±
0.7% in the fad3 mutant transformed with pBI121, the presence of
pTiDES3 resulted in accumulation of 42.1% of the fatty acids as linolenic
acid. The increased content of linolenic acid was accompanied by a
decrease of corresponding magnitude in the content of linoleic acid. Thus, it
is clear that the fad3 gene encodes a linoleic acid desaturase. Introduction
of the fad3 gene into wild type tissues also resulted in significantly
increased accumulation of linolenic acid and a corresponding decrease in
linoleic acid (Table 2). Thus, it is apparent from these results that the
linolenic acid content of plant tissues can be increased by high level
expression of a linoleic acid desaturase. In the present embodiment, the
fad3 gene was placed under transcriptional control of the constitutive high
level CaMV 35S promoter carried on pBI121. The implication from these
results is that expression from this promoter raised the level of expression
of the fad3 gene to levels higher than are normally achieved by expression
from the endogenous fad3 promoter. The results presented here indicate
that the fad3 gene has significant utility in genetic modification of higher
plants to elevate linolenic acid levels.
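
The "decrease of corresponding magnitude" can be checked with simple arithmetic on the mean values of Table 2. The following is only an illustrative sketch: the dictionary layout and the function name shift are assumptions introduced here, the standard deviations are left out, and only the 18:2 and 18:3 means are transcribed.

```python
# Mean mol% values for 18:2 and 18:3 transcribed from Table 2.
# Standard deviations are deliberately omitted in this sketch.
TABLE2 = {
    ("wild type", "pBI121"):  {"18:2": 37.3, "18:3": 31.9},
    ("fad3", "pBI121"):       {"18:2": 62.2, "18:3": 6.7},
    ("wild type", "pTiDES3"): {"18:2": 15.7, "18:3": 51.3},
    ("fad3", "pTiDES3"):      {"18:2": 24.4, "18:3": 42.1},
}


def shift(genotype):
    """Change in mean mol% (pTiDES3 minus pBI121 control) per fatty acid."""
    control = TABLE2[(genotype, "pBI121")]
    treated = TABLE2[(genotype, "pTiDES3")]
    return {fa: round(treated[fa] - control[fa], 1) for fa in control}


for genotype in ("wild type", "fad3"):
    print(genotype, shift(genotype))
```

For the fad3 mutant the gain in 18:3 (about 35 mol%) is close to the loss in 18:2 (about 38 mol%), and the wild type shows the same pattern at a smaller scale, consistent with conversion of linoleic to linolenic acid.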

Example 2

Antisense expression of fad3 gene to decrease linolenic acid levels

In order to decrease the linoleic acid desaturase activity by
genetic engineering methodology, the cDNA insert of pBNDES3 was cloned
into plant expression cassettes in an antisense orientation. A 959 bp BglII
restriction fragment of pBNDES3 was used in the antisense expression
vectors. The fragment extends from 152 nucleotides downstream of the
initiating methionine codon of the cDNA to a second BglII restriction site
that is located near the C-terminus of the coding region. 189 nucleotides of
the coding region are excluded from this fragment. Triple ligations were
performed with the fad3 gene fragment to construct two separate plant
expression cassettes.
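
What "antisense orientation" means at the sequence level can be sketched as follows. The short sequence is hypothetical and is not taken from the fad3 cDNA; only the BglII recognition site (AGATCT) is a known fact. An insert cloned in the antisense orientation presents the reverse complement of the sense strand to the promoter. Because the BglII site is its own reverse complement, a fragment with BglII ends on both sides can ligate in either orientation, which is why the clones have to be screened for orientation.

```python
# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Hypothetical sense-strand fragment, for illustration only.
sense = "ATGGCTAGATCT"
antisense_insert = reverse_complement(sense)
print(antisense_insert)

# The BglII site is palindromic, so both fragment ends look identical
# to the ligase regardless of insert orientation.
BGLII_SITE = "AGATCT"
print(reverse_complement(BGLII_SITE) == BGLII_SITE)
```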

A seed specific expression cassette was constructed by insertion
of the BglII fragment of pBNDES3 in an antisense orientation behind the
soybean promoter for the α' subunit of β-conglycinin (7S promoter). A
975 bp HindIII to BglII fragment containing the 7S promoter derived from
pMON529 was prepared by digesting with BglII for 30 min at 37 °C followed
by addition of Calf Intestinal Alkaline Phosphatase (CIAP) (Boehringer
Mannheim). The reaction was allowed to proceed for 20 min followed by
purification of the linearized DNA using the GeneClean (Bio 101)
purification system. The DNA was then digested with HindIII. A fragment
derived from pMON999 containing the Nopaline synthase 3' region and the
pUC vector backbone was prepared by digestion with BamHI and
treatment with CIAP. The DNA was purified by the GeneClean procedure
and digested with HindIII. The fragment of pBNDES3 was prepared by
digestion with BglII. The three fragments were purified by agarose gel
electrophoresis and the GeneClean procedure. 50 to 200 ng of the purified
fragments were ligated for one hour at room temperature followed by
transformation into the E. coli strain JM101. Resulting transformant
colonies were used for plasmid preparation and restriction digestion
analysis. Double digestion with BglII and NcoI was used to screen for
transformants containing the fad3 gene in an antisense orientation. One
clone was designated as correct and named pMON13801.

A second expression cassette was constructed to allow for
constitutive expression of the antisense message in plants. A fragment
containing the enhanced 35S promoter was prepared from pMON999 by
restriction digestion with HindIII and BglII followed by treatment with
CIAP as above. The correct sized fragment was obtained by agarose gel
electrophoresis and the GeneClean procedure. The BglII to HindIII vector
fragment and the BglII fragment of pBNDES3 which were purified above
were used in this construction. Ligation, transformation and screening of
clones were as described above. One clone was designated as correct and
named pMON13802.

In both pMON13801 and pMON13802, the promoter, fad3 gene
and the Nos 3' region can be isolated on a NotI restriction fragment. These
fragments can then be inserted into a unique NotI site of the vector
pMON17227 to construct glyphosate selectable plant transformation
vectors. The vector DNA is prepared by digestion with NotI followed by
treatment with CIAP. The fad3 containing fragments are prepared by
digestion with NotI, agarose gel electrophoresis and purification with
GeneClean. Ligations are performed with approximately 100 ng of vector
and 200 ng of insert DNA for 1.5 hours at room temperature. Following
transformation into the E. coli strain LE392, transformants were screened
by restriction digestion to identify clones containing the fad3 expression
cassettes. Clones in which transcription from the fad3 cassette is in the
same direction as transcription from the selectable marker were designated
as correct and named pMON13804 (FMV/CP4/E9, 7S/anti fad3/NOS)
(Figure 8) and pMON13805 (FMV/CP4/E9, E35S/anti fad3/NOS) (Figure
9).

In preparation for transforming canola cells, pMON13804 and 
pMON13805 were mated into Agrobacterium ABI by a triparental mating 
with the helper plasmid pRK2013. 

Seeds from the plants produced by transformation were
analyzed for alterations in fatty acid profile. Fatty acid methyl esters
(FAMES) were prepared from seed tissue and analyzed by capillary gas
chromatography (Browse et al., 1986). For initial screening of plants, six
seeds were pooled together from an individual plant. The seeds were
crushed and FAMES extracts were made. Control plants, that is, plants
transformed with the selectable marker only (pMON17227), were also
analyzed using the identical procedure. From the initial screen on pooled
seed samples, several lines were identified which displayed a decreased level
of linolenic acid. Lines with decreased levels of linolenic acid were
reanalyzed by determining fatty acid profiles from individual seeds. Four to
twenty individual seeds were analyzed from candidate lines and from
selected control plants. The results of the FAMES analysis are summarized
in Figure 9.

Figure 9 shows the levels of fatty acids expressed in molar
percent of twenty individual seeds of the transgenic line 13804-51 as
compared to control seed. Panel A discloses oleic acid, panel B discloses
linoleic acid and panel C discloses linolenic acid.

The data in Figure 9 demonstrate that antisense expression of a
linoleic acid desaturase has significantly altered the fatty acid profile of the
resulting seed tissue. The percent of linolenic acid has been reduced to a
little over 2% of the total fatty acid in the seed tissue. The percent of
linoleic acid has been reduced slightly and, surprisingly, the percent of oleic
acid in the seed has been increased to approximately 70%. This
demonstrates the applicability of utilizing the fad3 gene to manipulate the
fatty acid profile of crop plants.

In order to demonstrate that the alteration in the fatty acid
profile of the FAMES extracted from total seed tissue would be reflected in
the seed oil fraction, triglycerides from seeds of fad3 antisense plants were
characterized. Total lipid extracts were made by pooling ten seeds and
grinding in 2 ml of methanol:chloroform:water (4:2:1). The homogenate was
allowed to stand for 20 min and then debris was pelleted and discarded. To
the supernatant 400 µl of chloroform:methanol (2:1), 640 µl of chloroform
and 740 µl of water were added and the mixture was vortexed. Phases were
separated by centrifugation and the chloroform phase was recovered and
dried under nitrogen. Samples were resuspended in 100 µl of chloroform and
10 µl was applied to silica gel G thin layer chromatography plates for
separation. Two identical plates were prepared with one being charred after
development to allow for alignment and location of spots to be analyzed on
the other plate. Plates were developed three times in petroleum
ether:diethyl ether:acetic acid (90:10:1). One plate was sprayed with 50%
sulfuric acid and heated in an oven at 90 °C to allow for detection of lipids.
Triglyceride fractions were identified as comigrating on the plate with
purchased lipid standards (Sigma Chemical Co, cat #178-13). The charred
plate was aligned with the identical plate and the triglyceride fractions were
scraped from the plate. The fatty acids were transesterified to produce
FAMES extracts for GC analysis by the same procedure as above. The
fatty acid profiles of the triglyceride fractions are shown in Table 3 and
demonstrate that this fraction has decreased linolenic acid.



TABLE 3

Transgenic                 Mol%
line           18:1        18:2        18:3

17227-10       44          30          15.3
17227-493      65          17           6.9
13804-47       58          21           4.3
13804-50       67          20           2.8
13804-76       59          19           5.0
13804-117      62          21           4.0



Table 3 compares the fatty acid molar percentages of
triglyceride fractions from control and transgenic lines. The above
results provide clear evidence that the fad3 gene can be used to decrease
the levels of linolenic acid in the storage oil of plants. The gene provides a
tool for the manipulation of the fatty acid profile of seed storage oil to
improve the products derived from the oil.
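
The comparison in Table 3 can be summarized by averaging the control and antisense lines. This is a sketch under stated assumptions: the 17227 lines are taken to be the marker-only (pMON17227) controls and the 13804 lines the antisense transformants, and the third data column, which carries no header in the table as reproduced here, is assumed to be 18:3. The names LINES and group_means are illustrative only.

```python
from statistics import mean

# Values transcribed from Table 3 as (18:1, 18:2, 18:3) in mol%.
# Assumption: the unlabeled third column is 18:3 (linolenic acid).
LINES = {
    "17227-10":  (44, 30, 15.3),
    "17227-493": (65, 17, 6.9),
    "13804-47":  (58, 21, 4.3),
    "13804-50":  (67, 20, 2.8),
    "13804-76":  (59, 19, 5.0),
    "13804-117": (62, 21, 4.0),
}
FATTY_ACIDS = ("18:1", "18:2", "18:3")


def group_means(prefix):
    """Mean mol% per fatty acid over all lines whose name starts with prefix."""
    rows = [values for name, values in LINES.items() if name.startswith(prefix)]
    return {fa: mean(col) for fa, col in zip(FATTY_ACIDS, zip(*rows))}


print("control (17227):  ", group_means("17227"))
print("antisense (13804):", group_means("13804"))
```

With only two control lines, and those two differing widely from each other, the group means are a rough guide rather than a statistical test.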
A surprising result of this Example 2 is the effect the antisense
fad3 gene has on the oleic acid content. The precise mechanism by which
antisense expression of a gene exerts an effect on the activity of an
endogenous gene is unclear but is obviously a function of the homology of
the sense and antisense gene products. Based upon the above
experimental result, it would not be unreasonable to predict that the
portion of the fad3 gene antisense message used contained a certain degree
of homology with the genes providing the activity of one or more oleate
desaturases. Therefore, a further advantage of the above invention is that
it is possible that expression of a linoleic acid desaturase antisense
message may exert an effect on oleate desaturase activity.

The unexpected nature of the reduction in oleic acid desaturase
activity from the antisense fad3 plants is most apparent when one
compares the fatty acid profiles from the antisense plants and the fad3
mutant of Arabidopsis. The levels of linoleic acid in the fad3 mutant plants
increased when linoleic acid desaturase activity was eliminated by
mutation. This indicates that the activity of the oleate desaturase was not
highly affected by the loss of linoleic acid desaturase activity or by the
accumulation of linoleic acid. In the fad3 mutant of Arabidopsis the level of
linoleic acid increased when the level of linolenic acid decreased. However, a
different pattern occurred in the antisense fad3 plants. In plants which
exhibit a decreased percent of linolenic acid there is no corresponding
increase, and often a decrease, in the percent of linoleic acid. There is an
increase in the percent of oleate in the antisense fad3 plants. This would
indicate that oleate desaturase activity is depressed in these plants. The
effects on the fatty acid profile of the fad3 mutation and of fad3 antisense
expression are not equivalent, indicating that antisense expression of a
linoleic acid desaturase can depress an oleate desaturase activity in plants.
Example 3

Modification of linolenic acid levels in soybean

The isolation of the fad3 gene from B. napus provides a tool to
those with ordinary skill in the art to isolate the corresponding gene or
cDNA from other plant species. There are many examples in which genes
from one plant species have been used to isolate the homologous genes
from another plant species. One such plant which could be improved upon
by the modification of the level of linolenic acid is soybean.

Soybean oil typically contains linolenic acid at a level of 7-9% of
the fatty acid in the oil. This level is undesirable because it promotes
instability upon heating and imparts rancidity to the finished product. The
levels of linolenic acid can be lowered by the expression of the soybean fad3
gene or cDNA in an antisense orientation in the developing seed. The
following example describes one method for the isolation of a fad3 cDNA
from soybean. However, similar procedures could be followed to isolate a
genomic clone which could also be used to decrease the level of linoleic acid
desaturase activity by antisense expression of a portion or all of the gene.

The fad3 gene from B. napus is used as a probe to screen a cDNA
library constructed from soybean mRNA. In order to isolate a cDNA to be
used in decreasing linolenic acid in seed, the optimal tissue to use for the
isolation of mRNA is developing seed. There is, however, flexibility in the
choice of methods and vectors which can be used in the construction and
analysis of cDNA libraries (Sambrook et al., 1989). Procedures for the
construction of cDNA libraries are available from manufacturers of cloning
materials or from laboratory handbooks such as Sambrook et al., 1989.
Once a suitable cDNA library has been constructed from soybean, all or a
portion of the fad3 cDNA from B. napus is labeled and used as a probe of the
library. DNA fragments can be labeled for radioactive or non-radioactive
screening procedures. The library is screened under suitable stringency.
Conditions are dependent upon the degree of homology between the fad3
gene of B. napus and soybean. Probe positive clones are plaque purified by
standard procedures and characterized by restriction enzyme mapping and
DNA sequence analysis. Clones are concluded to be soybean fad3 based
upon data obtained from the sequence analysis or by expression in plants.

The entire clone or a portion thereof is placed downstream of a
promoter sequence in an antisense orientation. Suitable promoters include
seed specific promoters, such as the 7S (β-conglycinin) α'-subunit
promoter, or less tissue specific promoters, such as the CaMV 35S
promoter. An appropriate 3' non-translated region is placed downstream of
the antisense cDNA to allow for transcription termination and for the
addition of polyadenylated nucleotides to the 3' end of the RNA sequence.
This expression cassette is then combined with a selectable or scorable
marker gene and soybean cells are transformed by free DNA delivery
(Christou et al., 1990) or an Agrobacterium based method of plant
transformation (Hinchee et al., 1988). Plants recovered are allowed to set
seed and mature seed are used for the production of FAMES by the
procedures outlined above. The FAMES extracts are analyzed by gas
chromatography to identify plant lines with reduced levels of linolenic acid
in the seed.

Alternatives to the above methods may include but are not
limited to the use of degenerate oligonucleotides as probes to screen the
library. Degenerate oligonucleotide probes would be best
designed by choosing short segments of the fad3 amino acid sequence where
the degeneracy of the genetic code is limited or by choosing sequences which
appear to be highly conserved between the fad3 gene of B. napus and other
known linoleic acid desaturases, such as the desaturase from the
cyanobacterium Synechocystis. The oligonucleotides could be labeled and
used to probe a soybean cDNA library. Alternatively, degenerate
oligonucleotides could be used as primers for the isolation of a portion or all
of the soybean cDNA by PCR amplification.
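
Choosing segments "where the degeneracy of the genetic code is limited" can be sketched as a scan for the peptide windows that require the fewest distinct oligonucleotides. The example protein string below is hypothetical and is not the fad3 sequence; the function names and the window width of six residues are assumptions made for illustration. The codon counts are those of the standard genetic code.

```python
from math import prod

# Number of codons encoding each amino acid in the standard genetic code.
CODONS_PER_AA = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def degeneracy(peptide):
    """Number of distinct DNA sequences that can encode the peptide."""
    return prod(CODONS_PER_AA[aa] for aa in peptide)


def rank_windows(protein, width=6):
    """Return (degeneracy, start, window) tuples, least degenerate first.

    start is a 0-based index into the protein sequence.
    """
    windows = [
        (degeneracy(protein[i:i + width]), i, protein[i:i + width])
        for i in range(len(protein) - width + 1)
    ]
    return sorted(windows)


# Hypothetical protein fragment, for illustration only.
example_protein = "LRSMWDEHFKYCA"
for fold, start, window in rank_windows(example_protein)[:3]:
    print(f"{window} at position {start}: {fold}-fold degenerate")
```

Windows rich in methionine and tryptophan (one codon each) rank first, while windows containing leucine, arginine or serine (six codons each) rank last. In practice, sequence conservation among known desaturases would be weighed alongside this count.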

Similar procedures could be used to isolate the homologous genes
from other plant species. Another preferred plant species which could be
improved upon by the modification of the level of linolenic acid is flax. Flax
oil typically contains linolenic acid at a level of 45-65% of the fatty acid in
the oil. This level is undesirable because it promotes instability upon
heating and imparts rancidity to the finished product.
Example 4

Sense expression of fad3 to obtain reduced levels of linolenic acid

The cloning of the fad3 gene also provides a tool to decrease the
levels of linolenic acid via the mechanism of co-suppression.
Co-suppression occurs when plants are transformed with a
gene that is identical or highly homologous to an allele found in the plant's
genome (Bird and Ray, 1991). There are several examples where
expression of a chimeric gene in plants can result in a reduction of the gene
product from both the chimeric gene and the endogenous gene(s). Therefore
the fad3 gene product of B. napus may be reduced by transformation of B.
napus with all or a portion of the fad3 cDNA which has been isolated. The
resulting plant has reduced linoleic acid desaturase activity in tissues
where the chimeric gene is expressed. The phenotype resulting from reduced
linoleic acid desaturase activity is a reduction in the levels of linolenic acid.
The mechanism of co-suppression could be applied to any plant species
from which the fad3 gene is cloned, provided the plant species is transformed
with fad3 in a sense orientation.

In order to reduce levels of linolenic acid by the mechanism of
co-suppression, a plant transformation construct is assembled with the
fad3 gene or cDNA in a sense orientation. The entire clone or a portion
thereof is placed downstream of a promoter sequence in a sense
orientation. Suitable promoters include seed specific promoters, such as
the 7S (β-conglycinin) α'-subunit promoter, or less tissue specific
promoters, such as the CaMV 35S promoter. An appropriate 3'
non-translated region is placed downstream of the fad3 gene to allow for
transcription termination and for the addition of polyadenylated
nucleotides to the 3' end of the RNA sequence. This expression cassette
is then combined with a selectable marker gene and B. napus cells are
transformed by an Agrobacterium based method of plant transformation.
Plants recovered are allowed to set seed and mature seed are used for
the production of FAMES which are analyzed by gas chromatography to
identify plant lines with reduced levels of linolenic acid in the seed.
Example 5

Isolation of a chloroplast delta 15 desaturase from Arabidopsis

A fragment of 959 bp was excised from the fad3 cDNA insert using the
restriction endonuclease BglII, and labeled radioactively according to
Feinberg and Vogelstein (1983). This fragment was used to probe a cDNA
library from Arabidopsis thaliana as described above (Example 1) except
that the hybridization temperature was 52°C. Several cDNA clones were
positive, and one of them (pVAl) was further characterized. Its deduced
amino acid sequence exhibited a strong homology with fad3 except at the
N-terminus. The cDNA insert was placed under the control of the 35S
promoter in the Ti vector pBI121, and the resulting construct, pBIVA12,
was electroporated into Agrobacterium (C58 pGV3101). The bacterium was
used to transform the Arabidopsis mutant fadD. For transformation,
plants were grown at 22°C with a light intensity of 100 µE/cm-2, until
bolting (approximately 2 and 1/2 weeks). The stems (1 mm-10 mm long)
were removed and the plants were inoculated with a drop of an overnight
culture of the bacterium. The same operation was repeated 7 days
afterwards.






wo 94/1S337 



PCTAJS94/01321 






The plants were then allowed to set seeds. The seeds were plated (2500
seeds per 150 mm petri dish) on MSO plates that contained 50 µg/ml
kanamycin to select for plants that had integrated the construct. One
transformant plant was obtained, and the fatty acids from its leaves
were analyzed by gas chromatography (Table 4). The results obtained show
that the pBIVA12 construct is able to reestablish the levels of
linolenic and hexadecatrienoic acids in the fadD mutant at a level equal
to or superior to the wild type. This demonstrates that pVA12 encodes
the fadD gene.

TABLE 4

fatty acid    WT      FadD    pBIVA12

16:0          13.0    14.0    14.9
16:1           4.9     4.3     4.2
16:2           8.7     0.5     0.3
16:3           3.0    13.2     9.5
18:1           3.3     2.3     1.2
18:2          36.4    10.9     5.8
18:3          30.8    54.6    63.7

Table 4 shows the complementation of the fadD mutant. Fatty acids were
extracted from leaves of Arabidopsis according to Browse et al. (1986)
and were quantified (mol%) by gas chromatography. WT stands for the
Columbia wild type.
Example 6

Isolation of a second chloroplast delta 15 desaturase from Arabidopsis

A fragment of 959 bp was excised from the cDNA insert using the
restriction endonuclease BglII, and labelled radioactively according to
Feinberg and Vogelstein (1983). This fragment was used to probe a cDNA
library from Arabidopsis, exactly as described above (Example 5). Among
the several positive clones obtained, the cDNA pVA34 was further
characterized. Its deduced amino acid sequence exhibited 71.8% and 79.5%
homology with fad3 and fadD, respectively. The N-terminus resembled a
chloroplast transit peptide, meaning that this protein is likely to be
localized to the chloroplast. The strong homology with fad3 and fadD
suggests that the protein is also a delta 15 desaturase. Aside from fad3
and fadD, the only locus known to control delta 15 desaturation is the
fadE locus, which controls a temperature-induced delta 15 desaturase.
Therefore, it is likely that the cDNA contained within the clone pVA34
corresponds to the fadE locus.
Example 7

Linoleic desaturase homology to plant oleic desaturases

The linoleic desaturase genes are the first plant desaturases isolated
whose proteins enzymatically perform the desaturation of an unsaturated
fatty acid precursor. The reaction that linoleic desaturase performs and
the cofactors it uses are likely to be very similar to those of the
oleic desaturase reaction. Given the similar reactions, similar
substrates and probably similar cofactors, it is likely that the oleic
desaturase genes and proteins have homology to the linoleic desaturase
genes and proteins. That the genes share homology is supported by the
finding that antisense expression of the linoleic acid desaturase
message results in higher oleic acid levels, which experimentally
indicates homology between the linoleic and oleic desaturases. These
factors indicate that the linoleic desaturase protein and nucleic acid
sequences provide useful information for isolating other lipid
desaturase genes, particularly oleic desaturase genes.

a. Identification of unknown cDNA sequences in databases.

Random cDNA sequencing generates a large number of sequenced clones but
provides no information about the function of the encoded proteins.
Homology to known proteins is the quickest method for identifying the
protein function encoded in the sequenced cDNA. However, homology
searches are informative only when homology with a previously
characterized protein is found. A cDNA sequence that is not homologous
to any known protein remains in the unknown function category. Thus the
results functionally identifying the linoleic desaturases by sequence
and by their ability to complement mutations in plant desaturase genes
now provide a method for identifying the function and identity of random
cDNA clones by their homology to the linoleic desaturases. Additionally,
oleic desaturases are identified by their homology with linoleic
desaturases.

A TFASTA search of the GenBank and EMBL public databases for genes
encoding proteins homologous to the protein sequence of the linoleic
desaturase fad3 has identified both linoleic desaturases and a second
class of plant lipid desaturases likely to be oleic desaturases. In
particular, sequences found in GenBank and EMBL and identified as T04093
and T12950 show significant homology to linoleic desaturases but show
less homology than other linoleic desaturases. These sequences have 30%
identity and 56% similarity to the fad3 linoleic desaturase (TABLE 5).
The full length clone of these cDNAs is obtained by standard methods and
is inserted into plant gene expression and transformation vectors and
transformed into fad2 Arabidopsis mutants to confirm the identity of the
oleic desaturase by genetic complementation as was described in the
example with linoleic desaturase.
TABLE 5

Comparison of Fad3 and T04093 Protein Sequences

Percent Similarity: 52.381%   Percent Identity: 30.476%

fad3    101 GHGSFSDIPLLNSWGHILHSFILVPYHGWRISHRTHHQNHGHVENDESW 150
T04093    1                LIFHSFLLVPYFSWKYSHRRHHSNTGSLERDEVF 34

fad3    151 VPLPEKLYKNLP.....HSTRMLRYTVPLPMLAYPIYLWYRSPGKEGSHF 195
T04093   35 VPKQKSAIKWYGKYLNNPLGRIMMLTVQF.VLGWPLYLAFNVSGR...PY 80

fad3    196 NPYSSLFAPSERKLIATSTTCWSIMLATLVYLSFLVDPVTVLKVYGVPYI 245
T04093   81 DGFACHFFPNAPIYNDRERSRYTSLMRVF* 110



b. Isolation of an oleic desaturase cDNA.

The protein sequence of plant linoleic desaturases can be used to
isolate oleic desaturases. The conserved regions between the linoleic
desaturases and the DesA oleic desaturase are functionally important and
are conserved in the plant oleic desaturase proteins as well. These
conserved amino acid sequences provide a method of isolating plant oleic
desaturases. There are several regions of the linoleic desaturase fad3
that are conserved in fadD, fadE and DesA. The consensus amino acid
sequence is shown in Table 6, with the amino acids identical in all four
proteins shown in capital letters. As described below, oligonucleotides
designed to encode the amino acid sequences in the conserved regions are
used to identify and isolate plant oleic desaturases.
TABLE 6

Fad3 Protein Sequence and Peptide Targets

     MWAMDQRSNVNGDSGARKEEGFDPSAQPPFKIGDIRAAIPKHCWVKSPLRSMSYVTRD
     V.tplttp...spseed..erfdpgapppf.laDIraaiPKhCwvKnpwksmsyVvxd
                                      DIraaiPKhCwvK
                                 (1a) DIraaiP
                                     (1b) aiPKhC
                                        (1c) KhCwvK

     IFAVAALAMAAVYFDSWFLWPLYWVAQGTLFWAIFVLGHDCGHGSFSDIPLLNSWGHIL
     va.vfalaa.aayfnnW.lwPlyW.aqGTmfwalFVlGHDCGHgSFsndp.lNswGH.l
                     WflwPlyWvaqGT     FVlGHDCGHgSF
                (2a) WflwPlyW     (3a) FVlGHD
                (2b) WflwP         (3b) VlGHDC
                   (2c) wPlyW        (3c) GHDCGH
                       (2d) WvaqGT      (3d) CGHgSF

     HSFILVPYHGWRISHRTHHQNHGHVENDESWVPLPEKLYKNLPHSTRMLRYTVPLPMLAY
     hssilvPyHgWRisHrtHHqnhghvEnDesWhPl.ekiyknlpk.trmfrftlplpmlay
           PyHgWRisHrtHH      EnDesWvP
      (4a) PyHgW         (5a) EnDesW
        (4b) HgWRisH       (5b) DesWvP
          (4c) WRisHrtHH
          (4d) WRisH
              (4e) HrtHH

     PIYLWYRSPGKEGSHFNPYSSLFAPSERKLIATSTTCWSIMLAT.LVYLSFLVDPVTVLK
     pfylw.rspgk.gShyhpds.lF.pkerkdvltStacwtamaAl.lvcLnft.gpiqmlK

     VYGVPYIIFVMWLDAVTYLHHHGHDEKLPWYRGKEWSYLRGGL.TTIDRDYG.IFNNIH
     lygiPywifvmWldfvTylHHhghedklpwyrgkeWSylrggL.tTidrDYg.winnih
                WldavTylHH              WSylrggL.tTidrDY
           (6a) WldavT             (7a) WSylrggL
                (6b) TylHH                (7b) L.tTidrD
                                             (7c) TidrDY

     HDIGTHVIHHLFPQIPHYHLVDATRAAKHVLGRYYREPKTSGAIPIHLVESLVASIK
     HDIgtHviHHLfpqIPhYhLveAteaaKpvlGkyyrEpk.sgplplhLlesl.ksik
     HDIgtHviHHLfpqIPhY
(8a) HDIgtH
     (8b) HviHHL
        (8c) HHLfpqI
         (8d) HLfpqIP
          (8e) LfpqIPhY

     KDHYVSDTGDIVFYETDPDLYVYASDKSKIN*
     . dhyvsdtGdwyY eadp . lyg . . s *



c. Isolation of the fadC (fad6) Gene from Arabidopsis thaliana

The fadC gene (also referred to as fad6) encodes a chloroplastic omega-6
desaturase.

The deduced amino acid sequences of the fad3 gene from Brassica napus
and the fadD and fadE genes from Arabidopsis thaliana were compared with
the DesA gene from Synechocystis (Nature, 347:200, 1990). The sequence
GHDCGH was determined to represent the most highly conserved region of
these proteins. Consequently, a degenerate oligomer was designed that
contains all the possible codons for the sequence GHDCGH. This oligomer
has the following sequence:

GGNCAYGAYTGYGGNCA.
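
The design step above can be illustrated with a short sketch that back-translates a peptide into an IUPAC-degenerate oligomer. The codon table and the function name `back_translate` are illustrative assumptions, not part of the original method; the only facts taken from the text are the peptide GHDCGH and the resulting 17-nucleotide oligomer, which omits the third base of the last codon.

```python
# Hypothetical helper: IUPAC-degenerate codon for each amino acid
# (standard genetic code). Leu, Arg and Ser are written as YTN, MGN and
# WSN, which slightly over-covers their six codons.
DEGENERATE_CODON = {
    "A": "GCN", "C": "TGY", "D": "GAY", "E": "GAR", "F": "TTY",
    "G": "GGN", "H": "CAY", "I": "ATH", "K": "AAR", "L": "YTN",
    "M": "ATG", "N": "AAY", "P": "CCN", "Q": "CAR", "R": "MGN",
    "S": "WSN", "T": "ACN", "V": "GTN", "W": "TGG", "Y": "TAY",
}


def back_translate(peptide, drop_last_base=True):
    """Return a degenerate oligomer covering the codons of `peptide`.

    With drop_last_base=True the third base of the final codon is left
    off, as in the 17-mer quoted in the text.
    """
    oligo = "".join(DEGENERATE_CODON[aa] for aa in peptide.upper())
    return oligo[:-1] if drop_last_base else oligo


if __name__ == "__main__":
    print(back_translate("GHDCGH"))
```

Under these assumptions the sketch reproduces the consensus oligomer given above for GHDCGH.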

An Arabidopsis thaliana cDNA phage library obtained from the laboratory
of Dr. Ron Davis (PNAS, 88: 1731-1735) was used to screen for desaturase
genes. This library was made using material from all above ground plant
parts.

Approximately 120,000 phage from the library were plated onto three
plates and Hybond N+ was then used to prepare three filters from each
plate (Molecular Cloning: A Laboratory Manual, 2nd Edition. Eds. J.
Sambrook, E. F. Fritsch, and T. Maniatis, Cold Spring Harbor Laboratory
Press, Cold Spring Harbor, New York 1989, hereafter "Sambrook"). Two
filters from each plate were probed using the degenerate consensus
oligomer which had been end-labelled with (32)P using T4 polynucleotide
kinase. The hybridizations were performed in a solution that contained
high amounts of tetramethylammonium chloride in order to minimize
differences in the melting temperatures of the oligomers that together
comprise the degenerate consensus oligomer. The hybridization solution
had the following composition: 3 M tetramethylammonium chloride, 10 mM
sodium phosphate pH 6.8, 1.25 mM EDTA, 0.5% SDS, 0.5% milk.
Hybridization was carried out overnight at a temperature of 44°C.
Filters were then washed four times, 20 minutes each time, with 6 x SSC
+ 0.15% SDS at room temperature. Filters were then washed one time, for
30 minutes, with 4 x SSC + 0.1% SDS at room temperature. The filters
were then exposed to film for two days.

The third set of filters that were made from each phage-containing plate
were probed using DNA sequences from the three Arabidopsis desaturase
genes that had already been identified: fad3, fadD and fadE. The fad3,
fadD and fadE genes were labelled with (32)P and hybridized to the third
set of phage filters in the following hybridization solution: 0.2 M
NaCl, 20 mM sodium phosphate pH 7.7, 2 mM EDTA, 1% SDS, 0.5% milk, 10%
dextran sulfate, 0.1% sodium pyrophosphate. Hybridization was carried
out overnight at 65°C. Filters were washed four times, 30 minutes per
time, in 2 x SSC + 0.15% SDS at room temperature and then for 45 minutes
with 1 x SSC + 0.1% SDS at 65°C. The filters were then exposed to film
for approximately two hours.

The two sets of filters that were probed with the degenerate consensus
oligomer showed about 60 positive phage per plate (or about 180 total
positive phage). Results from the third set of filters that were probed
with the fad3, fadD and fadE genes indicated that only a small
percentage of the phage that hybridized to the consensus oligomer
contained the fad3, fadD or fadE genes.

Seventy-six of the phage that hybridized to the consensus oligomer, but
not to the fad3, fadD or fadE genes, were plaque purified. The purified
phage were then spotted onto bacteria growing on solid media on plates
and allowed to form plaques. Several duplicate filters were then made of
these plates (Sambrook). One of these filters was probed with the
consensus oligomer, as described above. A second filter was probed with
a mixture of the Arabidopsis thaliana fad3, fadD and fadE genes, as
described above.

In order to determine which of the 76 phage contained the same cDNA
inserts as which other phage, some of the filters were probed with cDNA
inserts from some of the phage. To perform this experiment, the cDNA
inserts from most of the phage were isolated by the polymerase chain
reaction (PCR), using oligomers that bound to DNA flanking the cDNA
cloning site in the phage vector. These cDNA sequences were labelled
with (32)P (random hexamer labelling) and hybridized to the filters
using the following hybridization solution: 30% formamide, 0.2 M NaCl,
20 mM sodium phosphate pH 7.7, 2 mM EDTA, 1% SDS, 0.5% milk, 0.1% sodium
pyrophosphate. The hybridizations were carried out for 14 hours at 65°C.
The filters were washed four times, 15 minutes per wash, with 2 x SSC +
0.15% SDS at room temperature and were then exposed to film.

The combination of the high formamide concentration in the hybridization
solution and the high hybridization temperature meant that only DNA
sequences that were virtually identical would hybridize, allowing us to
distinguish between nearly identical sequences. Several rounds of
hybridizations using cDNA inserts from different phage were carried out
until it had been determined which phage contained the same, or at least
extremely similar, cDNA inserts. On the basis of these experiments, we
determined that all of the 76 phage contained one of four cDNA inserts.
Sequence data was obtained from each of these four cDNAs. None of these
cDNAs was found to be homologous to known desaturase genes, and so we
feel that none of these four cDNAs is likely to encode a desaturase.

Since the number of phage that hybridized to the consensus oligomer was
quite high (about 180 phage hybridized in the initial screen described
above), we were not able to analyze all of the positive phage in the
initial experiments. So, an attempt was made to identify phage that
hybridized to the consensus oligomer but that did not contain the fad3,
fadD or fadE genes or one of the four non-desaturase encoding clones
that were identified in the first screen. In order to do this, between
500,000 and 1,000,000 phage from the library described above were plated
onto 10 plates. Three filters were made from each plate (Sambrook). Two
of these three sets of filters were then hybridized with (32)P labelled
consensus oligomer as described above except that hybridization was
carried out at 42°C instead of at 44°C. The third set of filters were
hybridized with (32)P labelled DNA from the Arabidopsis fad3, fadD and
fadE genes together with DNA from each of the four cDNAs identified in
the first round of screening as hybridizing to the consensus oligomer
but not encoding desaturases. This third set of filters were hybridized
in: 30% formamide, 0.2 M NaCl, 20 mM sodium phosphate pH 7.7, 2 mM EDTA,
1% SDS, 0.5% milk, 0.1% sodium pyrophosphate at 65°C. All three sets of
filters were hybridized for 12 hours and then washed several times with
2 x SSC + 0.15% SDS at room temperature. The filters were then exposed
to film.

Approximately 200 phage from each plate hybridized to the consensus
oligomer. 50-60% of these phage also hybridized to fad3, fadD, fadE or
to one of the four clones identified in the first screen. About 58 phage
that hybridized to the consensus oligomer, but not to fad3, fadD, fadE
or one of the four previously identified clones, were plaque purified.
The purified phage were then spotted onto a bacterial lawn growing on
solid media on a petri plate and the phage were allowed to form plaques.
Several filters were prepared from these plates and hybridized with
(32)P labelled cDNA inserts from various of the newly purified phage, as
described above. In this manner, all of the phage identified in this
second round of screening were found to contain one of eight different
cDNA inserts.

Sequence data was obtained from each of the eight cDNAs. One of the
cDNAs, which was contained within only one of the phage, was found to
have some sequence similarity to a known desaturase gene from
cyanobacteria, the DesA gene. Further sequence information was obtained
for this clone. This additional sequence showed very significant
sequence similarity to the DesA gene, confirming that the clone
contained a desaturase gene. The remainder of the cDNA contained within
the clone was sequenced and compared with the sequences of other known
desaturases. The new desaturase was 53.0% identical to DesA at the
nucleotide level and 43.9%, 45.6% and 47.0% identical to B. napus fad3,
Arabidopsis fadD and Arabidopsis fadE, respectively. As the gene
contained within the clone was significantly more similar in sequence to
the DesA gene (which is a delta-12 desaturase) than to fad3, fadD or
fadE (which are omega-3 desaturases), the new desaturase was expected to
be a delta-12 (= omega-6) desaturase.

The additional sequence data also indicated that this new desaturase
gene contains a region that has only a one base pair mismatch to the
desaturase consensus sequence described above. This mismatch means that
the new desaturase has the sequence GHDCAH instead of GHDCGH.
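
The one-base-pair statement can be checked with a minimal sketch. It assumes the standard IUPAC nucleotide codes and compares the consensus oligomer for GHDCGH with the same 17 positions back-translated for GHDCAH (alanine written as GCN). The function name `count_mismatches` and the variant string are illustrative, not taken from the original.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_mismatches(probe, target):
    """Count positions where two IUPAC-coded sequences share no base."""
    if len(probe) != len(target):
        raise ValueError("sequences must have equal length")
    return sum(
        1 for p, t in zip(probe, target)
        if not set(IUPAC[p]) & set(IUPAC[t])
    )


if __name__ == "__main__":
    probe = "GGNCAYGAYTGYGGNCA"    # consensus oligomer for GHDCGH (from the text)
    variant = "GGNCAYGAYTGYGCNCA"  # assumed back-translation of GHDCAH
    print(count_mismatches(probe, variant))
```

Only the second base of the fifth codon (G in glycine GGN, C in alanine GCN) is incompatible, which is consistent with the single mismatch described above.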

A clone containing a full length cDNA for this gene was isolated and
completely sequenced. This full length cDNA was sub-cloned into the
plant transformation vector pBI121 such that the gene is transcribed
under the control of the 35S promoter. This construct was then used to
complement the phenotype of a fadC mutant (Plant Phys. 90: 522-529,
1989) of Arabidopsis thaliana, indicating that the gene encodes a
chloroplastic omega-6 desaturase.

d. Proposed isolation of fad2

The most highly conserved peptide regions in the linoleic desaturases
and the DesA desaturase were chosen as regions likely to be conserved in
oleic desaturases. These 8 conserved regions are shown in TABLE 6. These
regions were chosen on the following basis: they have areas highly
conserved between the 3 linoleic desaturases and DesA, with at least 4
identical amino acids over a 10 amino acid span. Once a region was
identified as conserved, the fad3 linoleic desaturase sequence was used
as the amino acid sequence for the source of homology to identify oleic
desaturases. This is because both fad3 and the non-plastid oleic
desaturases are thought to be localized to the endoplasmic reticulum and
are most likely to contain similar amino acid sequences.
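
The selection criterion (at least 4 identical amino acids over a 10 amino acid span) can be sketched as a sliding-window scan over a consensus line in which capital letters mark residues identical in all four proteins. The function name `conserved_windows` is hypothetical, and the consensus string is one line copied from Table 6; this is an illustration of the stated rule, not the procedure actually used.

```python
def conserved_windows(consensus, span=10, min_identical=4):
    """Return (start, window) pairs whose window holds at least
    `min_identical` capital letters (residues identical in all proteins)."""
    hits = []
    for start in range(len(consensus) - span + 1):
        window = consensus[start:start + span]
        if sum(ch.isupper() for ch in window) >= min_identical:
            hits.append((start, window))
    return hits


if __name__ == "__main__":
    # Consensus line for the region containing peptide targets 4a-5b.
    consensus = "hssilvPyHgWRisHrtHHqnhghvEnDesWhPl.ekiyknlpk.trmfrftlplpmlay"
    for start, window in conserved_windows(consensus):
        print(start, window)
```

On this line the qualifying windows cluster around the PyHgWRisHrtHH and EnDesW stretches, and no window in the second half of the line qualifies.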

Several peptide endpoints in each conserved area were chosen as the
basis to subsequently design oligonucleotide probes for identifying the
oleic desaturase gene. The peptide targets were chosen to be between 5
and 9 amino acids in length. The peptide end points were chosen to end
on the conserved (identical) amino acids, and most often to begin on
conserved amino acids. The rationale is that within the larger conserved
area, some amino acid portions are more highly conserved than others,
that 15 to 27 nucleotides (5 to 9 amino acids) is a good primer size for
PCR, and that for PCR it is important that the 3' end of the primer
matches the target, with the conserved (identical) amino acids the most
likely to be present in the oleic desaturases. These 28 "oleic
desaturase" peptide targets (Table 6) are the basis for the
oligonucleotides that are designed for hybridizing to the oleic
desaturase cDNA sequences to identify and isolate the oleic desaturase
cDNA clone.

Several possible methods for designing oligonucleotides and isolating
the genes encoding the target peptide regions are known. For a
discussion of designing degenerate oligonucleotides see PCR Protocols -
A Guide to Methods and Applications, Eds M. A. Innis, D. H. Gelfand, J.
J. Sninsky and T. J. White, Academic Press, San Diego, California, 1990;
and Sambrook. The two most common screening methods using the
oligonucleotides are screening cDNA libraries and PCR amplification of
specific cDNAs. Gene probes from fad3, fadD and fadE are used under
stringent hybridization conditions to identify these cDNAs and discard
them in the screen for oleic desaturase cDNA clones. The method for
using degenerate oligonucleotides to screen a cDNA library has been
described in the example above demonstrating the isolation of the fadC
oleic desaturase gene. An immature plant seed active in oil
biosynthesis, generally 2 to 5 weeks after pollination, preferably about
3 to 4 weeks after pollination, of a plant such as Arabidopsis or canola
is used as the source of mRNA for making cDNA. First strand cDNA is made
from the isolated mRNA and hybridized under stringent conditions in
solution to an excess of biotinylated fad3, fadD and fadE cloned cDNAs.
The hybrids and biotinylated nucleic acids are removed with streptavidin
and a second round of subtraction is done to remove any remaining fad3,
fadD and fadE sequences. The cDNA remaining in solution is used for PCR
reactions. (For 5' RACE, see below, a polyA tail is added to the first
strand cDNA 3' end.)

A method that can readily evaluate a number of degenerate
oligonucleotide probes is degenerate PCR (see chapters by Compton and by
Lee and Caskey in PCR Protocols, cited above). In this method a
degenerate set of oligonucleotides encompassing all the possible codon
choices for the target peptide is synthesized (such degenerate
TABLE 7

Peptide Targets for Fad2 Cloning

Peptide   Peptide sequence   Oligo sequence 5' - 3'

1a        DIRAAIP            GAYATHMGNGCNGCNATHCC
1b        AIPKHC             GCNATHCCNAARCAYTG
1c        KHCWVK             AARCAYTGYTGGGTNAA
2a        WFLWPLYW           TGGTTYYTNTGGCCNYTNTAYTGG
2b        WFLWP              TGGTTYYTNTGGCCN
2c        WPLYW              TGGCCNYTNTAYTGG
2d        WVAQGT             TGGGTNGCNCARGGNAC
3a        FVLGHD             TTYGTNYTNGGNCAYGA
3b        VLGHDC             GTNYTNGGNCAYGAYTG
3c        GHDCGH             GGNCAYGAYTGYGGNCA
3d        CGHGSF             TGYGGNCAYGGNWSNTT
4a        PYHGW              CCNTAYCAYGGNTGG
4b        HGWRISH            CAYGGNTGGMGNATHWSNCA
4c-1      WRISHRTHH          TGGMGNATHTCNCAYMGNACNCAYCA *
4c-2      WRISHRTHH          TGGMGNATHAGYCAYMGNACNCAYCA *
4d        WRISH              TGGMGNATHWSNCAY
4e        HRTHH              CAYMGNACNCAYCAY
5a        ENDESW             GARAAYGAYGARWSNTGG
5b        DESWVP             GAYGARWSNTGGGTNCC
6a        WLDAVT             NGTNACNGCRTCNARCCA
6b        TYLHH              RTGRTGNARRTANGT
7a-1      WSYLRGGL           ARNCCNCCNCKNARRTARCTCCA *
7a-2      WSYLRGGL           ARNCCNCCNCKNARRTANGACCA *
7b        LTTIDRD            RTCNCKRTCDATNGTNGTNA
7c        TIDRDY             RTARTCNCKRTCDATNGT
8a        HDIGTH             RTGNGTNCCDATRTCRTG
8b        HVIHHL             NARRTGRTGDATNACRTG
8c        HHLFPQI            DATYTGNGGRAANARRTGRTG
8d        HLFPQIP            GGDATYTGNGGRAANARRTG
8e        LFPQIPHY           RTARTGNGGDATYTGNGGRAANA

* synthesize 4c and 7a in two pools each to limit the degeneracy

Oligos for 6a - 8e are the complement of the coding sequence
TABLE 8

Table of Oligomers for PCR RACE of fad2

Peptide #   Oligo    Fold         Similarity     Similarity in
            Length   Degeneracy   with L26296    Last 10 n.t.
                                  (%)            (%)

1a          20       384          75             80
1b          17       192          88             80
1c          17       32           65             80
2a          24       64           79             100
2b          15       48           73             80
2c          15       48           100            100
2d          17       128          76             90
3a          17       384          76             70
3b          17       384          82             80
3c          17       128          88             90
3d          17       384          82             70
4a          15       64           80             70
4b          20       192          75             90
4c          26       96*          81             80
4d          15       216          87             90
4e          15       192          87             80
5a          18       96           72             80
5b          17       96           76             80
6a          18       256          78             80
6b          15       192          93             100
7a          23       256*         78             60
7b          20       384          90             80
7c          18       192          94             90
8a          18       384          72             70
8b          18       192          89             80
8c          21       384          81             100
8d          20       192          80             90
8e          23       192          83             70

* done in two oligo pools
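
As a rough cross-check of the fold-degeneracy column, the following sketch counts the codon combinations that encode a peptide under the standard genetic code. The counting rule is an assumption inferred from the tabulated values: it appears to reproduce the entries for oligomers such as 1c, 2c, 3c, 4d and 4e, but not those where deoxyinosine or split pools were used to lower the degeneracy. The names `CODON_COUNT`, `PREFIX_COUNT` and `codon_degeneracy` are illustrative.

```python
# Number of codons per amino acid in the standard genetic code.
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}

# Number of distinct two-base codon prefixes per amino acid; only the
# six-codon amino acids (Leu, Arg, Ser) have more than one.
PREFIX_COUNT = {aa: (2 if n == 6 else 1) for aa, n in CODON_COUNT.items()}


def codon_degeneracy(peptide, drop_last_base=False):
    """Count the coding sequences for `peptide`.

    If drop_last_base is True, the third base of the last codon is not
    part of the oligomer, so the last residue contributes only its
    number of two-base prefixes.
    """
    peptide = peptide.upper()
    total = 1
    for aa in peptide[:-1]:
        total *= CODON_COUNT[aa]
    last = peptide[-1]
    total *= PREFIX_COUNT[last] if drop_last_base else CODON_COUNT[last]
    return total


if __name__ == "__main__":
    print(codon_degeneracy("GHDCGH", drop_last_base=True))  # peptide 3c
    print(codon_degeneracy("WRISH"))                        # peptide 4d
```

Whether the last base is dropped has to be read off the oligomer length in Table 8 (for example 17 nucleotides for the six-residue peptide 3c, but 15 nucleotides for the five-residue peptide 4d).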
Table 7 shows the 28 peptide targets from the eight conserved regions
and the 30 degenerate oligonucleotides derived from the peptide
sequences. The degeneracy was kept to less than 516 fold, for those
instances where more degeneracy occurred, by the use of deoxyinosine
(Sambrook et al.) and by not including the last nucleotide in the last
codon, and in two cases by the use of two subpools. Table 8 shows the
amount of degeneracy for each designed oligonucleotide sequence and the
amount of homology of the oligonucleotides to the Arabidopsis oleic
desaturase fad2 (Accession No. L26296). Also shown in Table 8 is the
percent homology in the last 10 nucleotides on the 3' end of each
primer, since this region is most important for annealing and elongation
under PCR conditions. It is expected that both 10 of 10 and 9 of 10
homology matches, and probably 8 of 10 homology matches, in the 3'
primer regions will serve as efficient PCR primers. Note that for
oligonucleotide sets 1a through 5b (for 3' RACE) the strand direction is
the same as the mRNA while for oligonucleotide sets 6a through 8e (for
5' RACE) the direction is opposite to the mRNA. Four oligonucleotides
have a 10 of 10 match in the 3' position, 6 oligonucleotides match 9 of
10 in the 3' position and 12 match 8 of 10 nucleotides in the 3'
position. Oligonucleotides corresponding to peptides 2a, 2c, 2d, 3c, 4b,
4d, 6b, 7c, 8c, and 8d show 90% or greater homology in their last 10
nucleotides and anneal to the oleic desaturase gene and serve as primers
to this gene. This demonstrates the validity of using the conserved
regions of the plant linoleic desaturases and DesA to identify and
isolate plant oleic desaturases.
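
The counts quoted in the paragraph above can be tallied directly from the last column of Table 8. The sketch below transcribes that column into a dictionary; the variable names are illustrative, and the label "2b" is an assumption for the one row whose peptide label is missing from the table as extracted (it sits between 2a and 2c).

```python
from collections import Counter

# Percent similarity in the last 10 nucleotides, transcribed from Table 8.
# The key "2b" is assumed: that row's label is missing in the source.
LAST10_SIMILARITY = {
    "1a": 80, "1b": 80, "1c": 80, "2a": 100, "2b": 80, "2c": 100,
    "2d": 90, "3a": 70, "3b": 80, "3c": 90, "3d": 70, "4a": 70,
    "4b": 90, "4c": 80, "4d": 90, "4e": 80, "5a": 80, "5b": 80,
    "6a": 80, "6b": 100, "7a": 60, "7b": 80, "7c": 90, "8a": 70,
    "8b": 80, "8c": 100, "8d": 90, "8e": 70,
}

# How many oligonucleotides fall at each similarity level.
tally = Counter(LAST10_SIMILARITY.values())

# Peptide targets whose primers match at least 9 of the last 10 bases.
strong_primers = sorted(
    name for name, pct in LAST10_SIMILARITY.items() if pct >= 90
)

if __name__ == "__main__":
    print(tally[100], tally[90], tally[80])
    print(strong_primers)
```

With the table values as transcribed, the tally agrees with the text: four primers at 10 of 10, six at 9 of 10 and twelve at 8 of 10, and the ten primers at 90% or above are the ones listed in the paragraph.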

The first round of PCR products are subjected to two rounds of
subtraction using biotinylated fad3, fadD and fadE cloned cDNA to remove
any hybridizing fad3, fadD and fadE sequences with streptavidin. This
subtracted DNA is greatly enriched for fad2 sequences and depleted of
fad3, fadD and fadE sequences. These 30 samples are run on agarose gels,
blotted and hybridized with pools of probe from the 30 samples. Pools of
5 of each of the 30 PCR samples are labeled with random primers and
hybridized to the blots of the 30 samples, for a total of 6 blots
hybridized with 6 pools of 5 probes. Additionally, a pool of fad3, fadD
and fadE probe is hybridized to a duplicate blot. Bands that do not
hybridize strongly to fad3, fadD and fadE but do cross hybridize to
probe made from a different sample are strong candidates for fad2, as
fad2 is likely to be the only DNA amplified in two or more independent
PCR reactions. Positively hybridizing lanes identify samples to amplify
by PCR using the same primers as in the initial reaction for 5 to 10
cycles, and the PCR products are cloned into plasmid vectors. The same
probe that recognized the sample on the blot is used to screen the
library and identify the hybridizing clone. Positive clones are
sequenced and identified as fad2 clones by their homology but
non-identity with fad3, and further characterized as described below.

In the event that fad2 sequences are not sufficiently enriched
in one round of PCR to be identified, a second round of PCR is performed. If
the lack of detection is due to insufficient amplification of fad2, then
another round of PCR using the same primers on the subtracted PCR first
round samples and the same simple screen as described above will identify
fad2. If there are too many competing non-specific reactions then a second
round of PCR using a different primer combination will remove non-specific
amplifications and enrich for fad2. To further enrich for fad2 sequences
each of the initial 30 PCR samples (one for each oligonucleotide in Table 7),
after subtraction as described above, is subjected to a second round of PCR
reactions using a different primer combination than the first reaction. One
of the primers would be the same degenerate oligonucleotide primer as in
the first PCR reaction. The second primer would now be from one of the 30
primers in Table 7 from the opposite class, i.e., primers from 1a to 5b form
matched sets with primers from 6a to 8e (primers 1a to 5b are in the sense
direction while primers 6a to 8e are in the antisense direction). For
example, if oligonucleotide 1a was used initially, it is used again as one of
the two primers and the second primer is each of the 6a to 8e
oligonucleotides for a total of 11 separate PCR reactions. In total the 30
initial reactions result in 418 second cycle PCR reactions, a number easily
handled by PCR technology. Essentially this second PCR cycle
accomplishes a "nested" or sequential PCR reaction step after removing all
the linoleic desaturases by the subtraction step. This increases the
amplification as well as the specificity. Identification of samples containing
fad2 is performed similarly as described above, with the 418 samples dot
blotted onto 22 filters and probed with 21 pools of 20 samples and with a
pool of fad3, fadD and fadE. Again, any sample that cross hybridizes with
an independent probe sample and does not hybridize to fad3, fadD and fadE
is a candidate for containing fad2 in the sample. If fad3, fadD and fadE
hybridization is still present, another biotinylation/streptavidin subtraction
should remove it. Positively hybridizing samples are run on gels, the band
identified by hybridization and isolated for cloning. This second set of PCR
reactions produces PCR products of a predictable size since both primers
are within the coding region where little variation in size is expected. Thus
the presence of a band of the expected size on a gel is diagnostic of fad2,
particularly if hybridization of a blot of such a gel with a fad3, fadD and
fadE probe indicates the band is not due to fad3, fadD and fadE
contamination. After cloning the inserts in E. coli, the resulting plasmids
containing the insert are identified by hybridization. They are sequenced
and identified as oleic desaturases by their homology but non-identity with
the linoleic desaturases, as in the examples described previously. The full
length clone of these cDNAs is obtained by standard methods and inserted
into plant gene expression and transformation vectors and transformed
into Arabidopsis fad2 mutants to confirm the identity of the oleic
desaturase by genetic complementation as was described in the example with
linoleic desaturase.
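
The reaction and filter counts quoted above can be checked with a short Python sketch. It rests on assumptions that should be read as such: the split into 19 sense primers is inferred from the 30 primers in total and the 11 antisense primers stated in the text, the primer labels are placeholders for the names in Table 7, and the reading of 22 filters as 21 pooled-probe filters plus one fad3/fadD/fadE control filter is an interpretation.

```python
import math

N_PRIMERS = 30     # degenerate oligonucleotides in Table 7
N_ANTISENSE = 11   # primers 6a to 8e, as stated in the text
N_SENSE = N_PRIMERS - N_ANTISENSE  # 19; inferred, not stated explicitly

# Placeholder labels; the real names are those of Table 7.
sense = [f"sense_{i}" for i in range(1, N_SENSE + 1)]
antisense = [f"antisense_{i}" for i in range(1, N_ANTISENSE + 1)]


def second_round_reactions(sense, antisense):
    """Pair every first-round primer with each primer of the opposite class."""
    reactions = [(first, second) for first in sense for second in antisense]
    reactions += [(first, second) for first in antisense for second in sense]
    return reactions


reactions = second_round_reactions(sense, antisense)
n_pools = math.ceil(len(reactions) / 20)   # probe pools of 20 samples
n_filters = n_pools + 1                    # plus one control filter (assumed)
print(len(reactions), n_pools, n_filters)  # 418 21 22
```

Each of the 19 sense-primed first-round samples is re-amplified with 11 antisense partners, and each of the 11 antisense-primed samples with 19 sense partners, which gives 209 + 209 = 418 reactions.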

Thus in this approach to isolating the plant oleic desaturases,
the total number of peptide regions is 8, comprised of 28 smaller peptide
targets. This leads to a set of 30 degenerate oligonucleotides that are used in
the PCR amplification and screening of the PCR products. Subtraction of
interfering fad3, fadD and fadE sequences is used at several points. If
necessary a second round of PCR reactions with paired internal primers
gives extra amplification and specificity. This approach identifies the plant
oleic desaturases, and the sequence of the isolated clones should confirm
their identity by their homology to the plant linoleic desaturases as
described. Thus a defined approach to isolating the plant oleic desaturases
from the information about linoleic desaturases is presented here. The
example given here is for Arabidopsis or canola oleic desaturases, but the
approach is not limited to those plants as the oleic desaturases are
probably highly conserved in most plants. Thus once one plant oleic
desaturase is isolated, the sequence information is used to isolate the genes
from other plant species by direct hybridization or by an approach similar
to the one described here.
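
To make the notion of a degenerate oligonucleotide concrete, the following Python sketch counts and enumerates the sequences represented by one such primer. The IUPAC ambiguity table is standard; the function names are illustrative, the example primer is the 17-mer given as SEQ ID NO: 14 in the sequence listing, and the concrete 17-mer used for the match test (codons for Ala Ile Pro Lys His Cys) is chosen for illustration only.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(oligo):
    """Number of distinct sequences represented by a degenerate oligo."""
    count = 1
    for base in oligo:
        count *= len(IUPAC[base])
    return count


def expand(oligo):
    """Yield every concrete sequence encoded by a degenerate oligo."""
    for combo in product(*(IUPAC[base] for base in oligo)):
        yield "".join(combo)


def matches(oligo, target):
    """True if the concrete sequence `target` is one member of the oligo pool."""
    return len(oligo) == len(target) and all(
        t in IUPAC[o] for o, t in zip(oligo, target)
    )


primer = "GCNATHCCNAARCAYTG"                 # SEQ ID NO: 14, spaces removed
print(degeneracy(primer))                    # 192
print(matches(primer, "GCGATTCCTAAGCATTG"))  # True
```

The degeneracy is the product of the choices at each ambiguous position (here N, H, N, R and Y, i.e. 4 x 3 x 4 x 2 x 2).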

SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT:

(A) NAME: Monsanto Company

(B) STREET: 800 North Lindbergh Boulevard

(C) CITY: St. Louis

(D) STATE: Missouri

(E) COUNTRY: United States of America

(F) POSTAL CODE (ZIP): 63167

(G) TELEPHONE: (314)694-3131

(H) TELEFAX: (314)694-5435

(ii) TITLE OF INVENTION: Altered Linolenic and Linoleic Acid Content
in Plants 

(iii) NUMBER OF SEQUENCES: 72 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)

(vi) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 08/156551

(B) FILING DATE: 22-NOV-1993

(vi) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 08/014431

(B) FILING DATE: 05-FEB-1993



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1353 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 87..1238

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
AATCCATCAA ACCTTTATTC ACCACATTTC ACTGAAAGGC CACACATCTA GAGAGAGAAA 60
CTTCGTCCAA ATCTCTCTCT CCAGCG ATG GTT GTT GCT ATG GAC CAG CGC AGC 113
Met Val Val Ala Met Asp Gln Arg Ser
1 5

AAT GTT AAC GGA GAT TCC GGT GCC CGG AAG GAA GAA GGG TTT GAT CCA 161
Asn Val Asn Gly Asp Ser Gly Ala Arg Lys Glu Glu Gly Phe Asp Pro
10 15 20 25

AGC GCA CAA CCA CCG TTT AAG ATC GGA GAT ATA AGG GCG GCG ATT CCT 209
Ser Ala Gln Pro Pro Phe Lys Ile Gly Asp Ile Arg Ala Ala Ile Pro
30 35 40

AAG CAT TGC TGG GTG AAG AGT CCT TTG AGA TCT ATG AGC TAC GTC ACC 257
Lys His Cys Trp Val Lys Ser Pro Leu Arg Ser Met Ser Tyr Val Thr
45 50 55

AGA GAC ATT TTC GCC GTC GCG GCT CTG GCC ATG GCC GCC GTG TAT TTT 305
Arg Asp Ile Phe Ala Val Ala Ala Leu Ala Met Ala Ala Val Tyr Phe
60 65 70

GAT AGC TGG TTC CTC TGG CCA CTC TAC TGG GTT GCC CAA GGA ACC CTT 353
Asp Ser Trp Phe Leu Trp Pro Leu Tyr Trp Val Ala Gln Gly Thr Leu
75 80 85

TTC TGG GCC ATC TTC GTT CTT GGC CAC GAC TGT GGA CAT GGG AGT TTC 401
Phe Trp Ala Ile Phe Val Leu Gly His Asp Cys Gly His Gly Ser Phe
90 95 100 105

TCA GAC ATT CCT CTG CTG AAC AGT GTG GTT GGT CAC ATT CTT CAT TCA 449
Ser Asp Ile Pro Leu Leu Asn Ser Val Val Gly His Ile Leu His Ser
110 115 120

TTC ATC CTC GTT CCT TAC CAT GGT TGG AGA ATA AGC CAT CGG ACA CAC 497
Phe Ile Leu Val Pro Tyr His Gly Trp Arg Ile Ser His Arg Thr His
125 130 135

CAC CAG AAC CAT GGC CAT GTT GAA AAC GAC GAG TCT TGG GTT CCG TTG 545
His Gln Asn His Gly His Val Glu Asn Asp Glu Ser Trp Val Pro Leu
140 145 150

CCA GAA AAG TTG TAC AAG AAC TTG CCC CAT AGT ACT CGG ATG CTC AGA 593
Pro Glu Lys Leu Tyr Lys Asn Leu Pro His Ser Thr Arg Met Leu Arg
155 160 165

TAC ACT GTC CCT CTG CCC ATG CTC GCT TAC CCG ATC TAT CTG TGG TAC 641
Tyr Thr Val Pro Leu Pro Met Leu Ala Tyr Pro Ile Tyr Leu Trp Tyr
170 175 180 185

AGA AGT CCT GGA AAA GAA GGG TCA CAT TTT AAC CCA TAC AGT AGT TTA 689
Arg Ser Pro Gly Lys Glu Gly Ser His Phe Asn Pro Tyr Ser Ser Leu
190 195 200

TTT GCT CCA AGC GAG AGG AAG CTT ATT GCA ACT TCA ACT ACT TGC TGG 737
Phe Ala Pro Ser Glu Arg Lys Leu Ile Ala Thr Ser Thr Thr Cys Trp
205 210 215

TCC ATA ATG TTG GCC ACT CTT GTT TAT CTA TCG TTC CTC GTT GAT CCA 785
Ser Ile Met Leu Ala Thr Leu Val Tyr Leu Ser Phe Leu Val Asp Pro
220 225 230

GTC ACA GTT CTC AAA GTC TAT GGC GTT CCT TAC ATT ATC TTT GTG ATG 833
Val Thr Val Leu Lys Val Tyr Gly Val Pro Tyr Ile Ile Phe Val Met
235 240 245

TGG TTG GAC GCT GTC ACG TAC TTG CAT CAT CAT GGT CAC GAT GAG AAG 881
Trp Leu Asp Ala Val Thr Tyr Leu His His His Gly His Asp Glu Lys
250 255 260 265

TTG CCT TGG TAC AGA GGC AAG GAA TGG AGT TAT TTA CGT GGA GGA TTA 929
Leu Pro Trp Tyr Arg Gly Lys Glu Trp Ser Tyr Leu Arg Gly Gly Leu
270 275 280

ACA ACT ATT GAT AGA GAT TAC GGA ATC TTC AAC AAC ATC CAT CAC GAC 977
Thr Thr Ile Asp Arg Asp Tyr Gly Ile Phe Asn Asn Ile His His Asp
285 290 295

ATT GGA ACT CAC GTG ATC CAT CAT CTT TTC CCA CAA ATC CCT CAC TAT 1025
Ile Gly Thr His Val Ile His His Leu Phe Pro Gln Ile Pro His Tyr
300 305 310

CAC TTG GTC GAT GCC ACG AGA GCA GCT AAA CAT GTG TTA GGA AGA TAC 1073
His Leu Val Asp Ala Thr Arg Ala Ala Lys His Val Leu Gly Arg Tyr
315 320 325

TAC AGA GAG CCG AAG ACG TCA GGA GCA ATA CCG ATT CAC TTG GTG GAG 1121
Tyr Arg Glu Pro Lys Thr Ser Gly Ala Ile Pro Ile His Leu Val Glu
330 335 340 345

AGT TTG GTC GCA AGT ATT AAA AAA GAT CAT TAC GTC AGT GAC ACT GGT 1169
Ser Leu Val Ala Ser Ile Lys Lys Asp His Tyr Val Ser Asp Thr Gly
350 355 360

GAT ATT GTC TTC TAC GAG ACA GAT CCA GAT CTC TAC GTT TAT GCT TCT 1217
Asp Ile Val Phe Tyr Glu Thr Asp Pro Asp Leu Tyr Val Tyr Ala Ser
365 370 375

GAC AAA TCT AAA ATC AAT TAA CTTTTCT TCCTAGCTCT ATTAGGAATA 1265
Asp Lys Ser Lys Ile Asn

380

AACACTCCTT GTCTTTTACT TATTTGTTTC TGCTTTAAGT TTAAAATGTA CTCGTGAAAC 1325 
TA TTAATGTATT TAGGTTAC 1353 



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 383 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Val Val Ala Met Asp Gln Arg Ser Asn Val Asn Gly Asp Ser Gly
1 5 10 15

Ala Arg Lys Glu Glu Gly Phe Asp Pro Ser Ala Gln Pro Pro Phe Lys
20 25 30

Ile Gly Asp Ile Arg Ala Ala Ile Pro Lys His Cys Trp Val Lys Ser
35 40 45

Pro Leu Arg Ser Met Ser Tyr Val Thr Arg Asp Ile Phe Ala Val Ala
50 55 60

Ala Leu Ala Met Ala Ala Val Tyr Phe Asp Ser Trp Phe Leu Trp Pro
65 70 75 80

Leu Tyr Trp Val Ala Gln Gly Thr Leu Phe Trp Ala Ile Phe Val Leu
85 90 95

Gly His Asp Cys Gly His Gly Ser Phe Ser Asp Ile Pro Leu Leu Asn
100 105 110

Ser Val Val Gly His Ile Leu His Ser Phe Ile Leu Val Pro Tyr His
115 120 125

Gly Trp Arg Ile Ser His Arg Thr His His Gln Asn His Gly His Val
130 135 140

Glu Asn Asp Glu Ser Trp Val Pro Leu Pro Glu Lys Leu Tyr Lys Asn
145 150 155 160

Leu Pro His Ser Thr Arg Met Leu Arg Tyr Thr Val Pro Leu Pro Met
165 170 175

Leu Ala Tyr Pro Ile Tyr Leu Trp Tyr Arg Ser Pro Gly Lys Glu Gly
180 185 190

Ser His Phe Asn Pro Tyr Ser Ser Leu Phe Ala Pro Ser Glu Arg Lys
195 200 205

Leu Ile Ala Thr Ser Thr Thr Cys Trp Ser Ile Met Leu Ala Thr Leu
210 215 220

Val Tyr Leu Ser Phe Leu Val Asp Pro Val Thr Val Leu Lys Val Tyr
225 230 235 240

Gly Val Pro Tyr Ile Ile Phe Val Met Trp Leu Asp Ala Val Thr Tyr
245 250 255

Leu His His His Gly His Asp Glu Lys Leu Pro Trp Tyr Arg Gly Lys
260 265 270

Glu Trp Ser Tyr Leu Arg Gly Gly Leu Thr Thr Ile Asp Arg Asp Tyr
275 280 285

Gly Ile Phe Asn Asn Ile His His Asp Ile Gly Thr His Val Ile His
290 295 300

His Leu Phe Pro Gln Ile Pro His Tyr His Leu Val Asp Ala Thr Arg
305 310 315 320

Ala Ala Lys His Val Leu Gly Arg Tyr Tyr Arg Glu Pro Lys Thr Ser
325 330 335

Gly Ala Ile Pro Ile His Leu Val Glu Ser Leu Val Ala Ser Ile Lys
340 345 350

Lys Asp His Tyr Val Ser Asp Thr Gly Asp Ile Val Phe Tyr Glu Thr
355 360 365

Asp Pro Asp Leu Tyr Val Tyr Ala Ser Asp Lys Ser Lys Ile Asn
370 375 380 



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
GGCGATGCTG TCGGAATGGA CGATA 25

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
CTTGGAGCCA CTATCGACTA CGCGATC 27

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
CCGATCTCAA GATTACGGAA T 21

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
TTCCTAATGC AGGAGTCGCA TAAG 24

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
AGGAGTCGCA TAAGGGAG 18

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
GGGAAGTGAA TGGAGAC 17

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1645 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 125..1465

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

GGAAAACACA AGTTTCTCTC ACACACATTA TCTCTTTCTC TATTACCACC ACTCATTCAT 60

AACAGAAACC CACCAAAAAA TAAAAAGAGA GACTTTTCAC TCTGGGGAGA GAGCTCAAGT 120

TCTA ATG GCG AAC TTG GTC TTA TCA GAA TGT GGT ATA CGA CCT CTC CCC 169
Met Ala Asn Leu Val Leu Ser Glu Cys Gly Ile Arg Pro Leu Pro
1 5 10 15

AGA ATC TAC ACA ACA CCC AGA TCC AAT TTC CTC TCC AAC AAC AAC AAA 217
Arg Ile Tyr Thr Thr Pro Arg Ser Asn Phe Leu Ser Asn Asn Asn Lys
20 25 30

TTC AGA CCA TCA CTT TCT TCT TCT TCT TAC AAA ACA TCA TCA TCT CCT 265
Phe Arg Pro Ser Leu Ser Ser Ser Ser Tyr Lys Thr Ser Ser Ser Pro
35 40 45

CTG TCT TTT GGT CTG AAT TCA CGA GAT GGG TTC ACG AGG AAT TGG GCG 313
Leu Ser Phe Gly Leu Asn Ser Arg Asp Gly Phe Thr Arg Asn Trp Ala
50 55 60

TTG AAT GTG AGC ACA CCA TTA ACG ACA CCA ATA TTT GAG GAG TCT CCA 361
Leu Asn Val Ser Thr Pro Leu Thr Thr Pro Ile Phe Glu Glu Ser Pro
65 70 75

TTG GAG GAA GAT AAT AAA CAG AGA TTC GAT CCA GGT GCG CCT CCT CCG 409
Leu Glu Glu Asp Asn Lys Gln Arg Phe Asp Pro Gly Ala Pro Pro Pro
80 85 90 95

TTC AAT TTA GCT GAT ATT AGA GCA GCT ATA CCT AAG CAT TGT TGG GTT 457
Phe Asn Leu Ala Asp Ile Arg Ala Ala Ile Pro Lys His Cys Trp Val
100 105 110

AAG AAT CCA TGG AAG TCT TTG AGT TAT GTC GTC AGA GAC GTC GCT ATC 505
Lys Asn Pro Trp Lys Ser Leu Ser Tyr Val Val Arg Asp Val Ala Ile
115 120 125

GTC TTT GCA TTG GCT GCT GGA GCT GCT TAC CTC AAC AAT TGG ATT GTT 553
Val Phe Ala Leu Ala Ala Gly Ala Ala Tyr Leu Asn Asn Trp Ile Val
130 135 140

TGG CCT CTC TAT TGG CTC GCT CAA GGA ACC ATG TTT TGG GCT CTC TTT 601
Trp Pro Leu Tyr Trp Leu Ala Gln Gly Thr Met Phe Trp Ala Leu Phe
145 150 155

GTT CTT GGT CAT GAC TGT GGA CAT GGT AGT TTC TCA AAT GAT CCG AAG 649
Val Leu Gly His Asp Cys Gly His Gly Ser Phe Ser Asn Asp Pro Lys
160 165 170 175

TTG AAC AGT GTG GTC GGT CAT CTT CTT CAT TCC TCA ATT CTG GTC CCA 697
Leu Asn Ser Val Val Gly His Leu Leu His Ser Ser Ile Leu Val Pro
180 185 190

TAC CAT GGC TGG AGA ATT AGT CAC AGA ACT CAC CAC CAG AAC CAT GGA 745
Tyr His Gly Trp Arg Ile Ser His Arg Thr His His Gln Asn His Gly
195 200 205

CAT GTT GAG AAT GAC GAA TCT TGG CAT CCT ATG TCT GAG AAA ATC TAC 793
His Val Glu Asn Asp Glu Ser Trp His Pro Met Ser Glu Lys Ile Tyr
210 215 220

AAT ACT TTG GAC AAG CCG ACT AGA TTC TTT AGA TTT ACA CTG CCT CTC 841
Asn Thr Leu Asp Lys Pro Thr Arg Phe Phe Arg Phe Thr Leu Pro Leu
225 230 235

GTG ATG CTT GCA TAC CCT TTC TAC TTG TGG GCT CGA AGT CCG GGG AAA 889
Val Met Leu Ala Tyr Pro Phe Tyr Leu Trp Ala Arg Ser Pro Gly Lys
240 245 250 255

AAG GGT TCT CAT TAC CAT CCA GAC AGT GAC TTG TTC CTC CCT AAA GAG 937
Lys Gly Ser His Tyr His Pro Asp Ser Asp Leu Phe Leu Pro Lys Glu
260 265 270

AGA AAG GAT GTC CTC ACT TCT ACT GCT TGT TGG ACT GCA ATG GCT GCT 985
Arg Lys Asp Val Leu Thr Ser Thr Ala Cys Trp Thr Ala Met Ala Ala
275 280 285

CTG CTT GTT TGT CTC AAC TTC ACA ATC GGT CCA ATT CAA ATG CTC AAA 1033
Leu Leu Val Cys Leu Asn Phe Thr Ile Gly Pro Ile Gln Met Leu Lys
290 295 300

CTT TAT GGA ATT CCT TAC TGG ATA AAT GTA ATG TGG TTG GAC TTT GTG 1081
Leu Tyr Gly Ile Pro Tyr Trp Ile Asn Val Met Trp Leu Asp Phe Val
305 310 315

ACT TAC CTG CAT CAC CAT GGT CAT GAA GAT AAG CTT CCT TGG TAC CGT 1129
Thr Tyr Leu His His His Gly His Glu Asp Lys Leu Pro Trp Tyr Arg
320 325 330 335

GGC AAG GAG TGG AGT TAC CTG AGA GGA GGA CTT ACA ACA TTG GAT CGT 1177
Gly Lys Glu Trp Ser Tyr Leu Arg Gly Gly Leu Thr Thr Leu Asp Arg
340 345 350

GAC TAC GGA TTG ATC AAT AAC ATC CAT CAT GAT ATT GGA ACT CAT GTG 1225
Asp Tyr Gly Leu Ile Asn Asn Ile His His Asp Ile Gly Thr His Val
355 360 365

ATA CAT CAT CTT TTC CCG CAG ATC CCA CAT TAT CAT CTA GTA GAA GCA 1273
Ile His His Leu Phe Pro Gln Ile Pro His Tyr His Leu Val Glu Ala
370 375 380

ACA GAA GCA GCT AAA CCA GTA TTA GGG AAG TAT TAC AGG GAG CCT GAT 1321
Thr Glu Ala Ala Lys Pro Val Leu Gly Lys Tyr Tyr Arg Glu Pro Asp
385 390 395

AAG TCT GGA CCG TTG CCA TTA CAT TTA CTG GAA ATT CTA GCG AAA AGT 1369
Lys Ser Gly Pro Leu Pro Leu His Leu Leu Glu Ile Leu Ala Lys Ser
400 405 410 415

ATA AAA GAA GAT CAT TAC GTG AGC GAC GAA GGA GAA GTT GTA TAC TAT 1417
Ile Lys Glu Asp His Tyr Val Ser Asp Glu Gly Glu Val Val Tyr Tyr
420 425 430

AAA GCA GAT CCA AAT CTC TAT GGA GAG GTC AAA GTA AGA GCA GAT TGAAATGAAG 1472
Lys Ala Asp Pro Asn Leu Tyr Gly Glu Val Lys Val Arg Ala Asp
435 440 445

CAGGCTTGAG ATTOAAGTTT TTTCTATTTC AGACCAGCTG ATTTTTTGCT TACTGTATCA 1532 

ATTTATTGTG TCACCCACCA GAGAGTTAGT ATCTCTGAAT ACGATCGATC AGATGGAAAC 1592

AACAAATTTG TTTGCGATAC TGAAGCTATA TATACCATAA AAAAAAAAAA AAA 1645



(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 446 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Ala Asn Leu Val Leu Ser Glu Cys Gly Ile Arg Pro Leu Pro Arg
1 5 10 15

Ile Tyr Thr Thr Pro Arg Ser Asn Phe Leu Ser Asn Asn Asn Lys Phe
20 25 30

Arg Pro Ser Leu Ser Ser Ser Ser Tyr Lys Thr Ser Ser Ser Pro Leu
35 40 45

Ser Phe Gly Leu Asn Ser Arg Asp Gly Phe Thr Arg Asn Trp Ala Leu
50 55 60

Asn Val Ser Thr Pro Leu Thr Thr Pro Ile Phe Glu Glu Ser Pro Leu
65 70 75 80

Glu Glu Asp Asn Lys Gln Arg Phe Asp Pro Gly Ala Pro Pro Pro Phe
85 90 95

Asn Leu Ala Asp Ile Arg Ala Ala Ile Pro Lys His Cys Trp Val Lys
100 105 110

Asn Pro Trp Lys Ser Leu Ser Tyr Val Val Arg Asp Val Ala Ile Val
115 120 125

Phe Ala Leu Ala Ala Gly Ala Ala Tyr Leu Asn Asn Trp Ile Val Trp
130 135 140

Pro Leu Tyr Trp Leu Ala Gln Gly Thr Met Phe Trp Ala Leu Phe Val
145 150 155 160

Leu Gly His Asp Cys Gly His Gly Ser Phe Ser Asn Asp Pro Lys Leu
165 170 175

Asn Ser Val Val Gly His Leu Leu His Ser Ser Ile Leu Val Pro Tyr
180 185 190

His Gly Trp Arg Ile Ser His Arg Thr His His Gln Asn His Gly His
195 200 205

Val Glu Asn Asp Glu Ser Trp His Pro Met Ser Glu Lys Ile Tyr Asn
210 215 220

Thr Leu Asp Lys Pro Thr Arg Phe Phe Arg Phe Thr Leu Pro Leu Val
225 230 235 240

Met Leu Ala Tyr Pro Phe Tyr Leu Trp Ala Arg Ser Pro Gly Lys Lys
245 250 255

Gly Ser His Tyr His Pro Asp Ser Asp Leu Phe Leu Pro Lys Glu Arg
260 265 270

Lys Asp Val Leu Thr Ser Thr Ala Cys Trp Thr Ala Met Ala Ala Leu
275 280 285

Leu Val Cys Leu Asn Phe Thr Ile Gly Pro Ile Gln Met Leu Lys Leu
290 295 300

Tyr Gly Ile Pro Tyr Trp Ile Asn Val Met Trp Leu Asp Phe Val Thr
305 310 315 320

Tyr Leu His His His Gly His Glu Asp Lys Leu Pro Trp Tyr Arg Gly
325 330 335

Lys Glu Trp Ser Tyr Leu Arg Gly Gly Leu Thr Thr Leu Asp Arg Asp
340 345 350

Tyr Gly Leu Ile Asn Asn Ile His His Asp Ile Gly Thr His Val Ile
355 360 365

His His Leu Phe Pro Gln Ile Pro His Tyr His Leu Val Glu Ala Thr
370 375 380

Glu Ala Ala Lys Pro Val Leu Gly Lys Tyr Tyr Arg Glu Pro Asp Lys
385 390 395 400

Ser Gly Pro Leu Pro Leu His Leu Leu Glu Ile Leu Ala Lys Ser Ile
405 410 415

Lys Glu Asp His Tyr Val Ser Asp Glu Gly Glu Val Val Tyr Tyr Lys
420 425 430

Ala Asp Pro Asn Leu Tyr Gly Glu Val Lys Val Arg Ala Asp
435 440 445 

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1525 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 61..1368

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

AGAGAGTGCA AATAGAACGA CAGAGACTTT TTCCTCTTTT CTTCTTGGGA AGAGGCTCCA 60 

ATG GCG AGC TCG GTT TTA TCA GAA TGT GGT TTT AGA CCT CTC CCC AGA 108 
Met Ala Ser Ser Val Leu Ser Glu Cys Gly Phe Arg Pro Leu Pro Arg 
1 5 10 15

TTC TAC CCT AAA CAC ACA ACC TCT TTT GCC TCT AAC CCT AAA CCC ACT 156 
Phe Tyr Pro Lys His Thr Thr Ser Phe Ala Ser Asn Pro Lys Pro Thr 
20 25 30 

TTC AAA TTC AAT CCA CCA CTT AAA CCT CCT TCT TCT CTT CTC AAT TCC 204 
Phe Lys Phe Asn Pro Pro Leu Lys Pro Pro Ser Ser Leu Leu Asn Ser 
35 40 45 

CGA TAT GGA TTC TAC TCT AAA ACC AGG AAC TGG GCA TTG AAT GTG GCA 252 
Arg Tyr Gly Phe Tyr Ser Lys Thr Arg Asn Trp Ala Leu Asn Val Ala 
50 55 60 

ACA CCT TTA ACA ACT CTT CAG TCT CCA TCC GAG GAA GAC ACG GAG AGA 300 
Thr Pro Leu Thr Thr Leu Gln Ser Pro Ser Glu Glu Asp Thr Glu Arg
65 70 75 80

TTC GAC CCA GGT GCG CCT CCT CCC TTC AAT TTG GCG GAT ATA AGA GCA 348
Phe Asp Pro Gly Ala Pro Pro Pro Phe Asn Leu Ala Asp Ile Arg Ala
85 90 95

GCC ATA CCT AAG CAT TGT TGG GTT AAG AAT CCA TGG ATG TCT ATG AGT 396
Ala Ile Pro Lys His Cys Trp Val Lys Asn Pro Trp Met Ser Met Ser
100 105 110

TAT GTT GTC AGA GAT GTT GCT ATC GTC TTT GGA TTG GCT GCT GTT GCT 444
Tyr Val Val Arg Asp Val Ala Ile Val Phe Gly Leu Ala Ala Val Ala
115 120 125

GCT TAC TTC AAC AAT TGG CTT CTC TGG CCT CTC TAC TGG TTC GCT CAA 492
Ala Tyr Phe Asn Asn Trp Leu Leu Trp Pro Leu Tyr Trp Phe Ala Gln
130 135 140

GGA ACC ATG TTC TGG GCT CTC TTT GTC CTT GGC CAT GAC TGC GGA CAT 540
Gly Thr Met Phe Trp Ala Leu Phe Val Leu Gly His Asp Cys Gly His
145 150 155 160

GGT AGC TTC TCG AAT GAT CCG AGG CTG AAC AGT GTG GCT GGT CAT CTT 588
Gly Ser Phe Ser Asn Asp Pro Arg Leu Asn Ser Val Ala Gly His Leu
165 170 175

CTT CAT TCC TCA ATT CTG GTC CCT TAC CAT GGC TGG AGG ATT AGC CAC 636
Leu His Ser Ser Ile Leu Val Pro Tyr His Gly Trp Arg Ile Ser His
180 185 190

AGA ACT CAC CAC CAG AAC CAT GGT CAT GTC GAG AAT GAC GAA TCA TGG 684
Arg Thr His His Gln Asn His Gly His Val Glu Asn Asp Glu Ser Trp
195 200 205

CAT CCT TTG CCT GAA AGC ATC TAC AAG AAT TTG GAA AAG ACG ACT CAA 732
His Pro Leu Pro Glu Ser Ile Tyr Lys Asn Leu Glu Lys Thr Thr Gln
210 215 220

ATG TTT AGG TTT ACA CTG CCT TTT CCA ATG CTC GCA TAC CCT TTC TAC 780
Met Phe Arg Phe Thr Leu Pro Phe Pro Met Leu Ala Tyr Pro Phe Tyr
225 230 235 240

TTG TGG AAC AGA AGT CCA GGG AAA CAA GGT TCT CAT TAT CAT CCG GAC 828
Leu Trp Asn Arg Ser Pro Gly Lys Gln Gly Ser His Tyr His Pro Asp
245 250 255

AGT GAC TTG TTT CTT CCA AAA GAG AAG AAA GAT GTT CTG ACA TCA ACT 876
Ser Asp Leu Phe Leu Pro Lys Glu Lys Lys Asp Val Leu Thr Ser Thr
260 265 270

GCC TGT TGG ACT GCA ATG GCT GCT TTG CTT GTT TGT CTC AAC TTT GTC 924
Ala Cys Trp Thr Ala Met Ala Ala Leu Leu Val Cys Leu Asn Phe Val
275 280 285

ATG GGT CCA ATC CAG ATG CTC AAA CTA TAT GGC ATC CCT TAT TGG ATA 972
Met Gly Pro Ile Gln Met Leu Lys Leu Tyr Gly Ile Pro Tyr Trp Ile
290 295 300

TTT GTA ATG TGG TTG GAC TTC GTC ACT TAC TTG CAC CAC CAT GGA CAT 1020
Phe Val Met Trp Leu Asp Phe Val Thr Tyr Leu His His His Gly His
305 310 315 320

GAA GAC AAG CTC CCT TGG TAT CGT GGA AAG GAA TGG AGT TAC CTG AGA 1068
Glu Asp Lys Leu Pro Trp Tyr Arg Gly Lys Glu Trp Ser Tyr Leu Arg
325 330 335

GGA GGG CTC ACA ACA TTA GAT CGT GAC TAC GGA TGG ATC AAT AAC ATC 1116
Gly Gly Leu Thr Thr Leu Asp Arg Asp Tyr Gly Trp Ile Asn Asn Ile
340 345 350

CAC CAC GAT ATT GGA ACT CAT GTC ATA CAT CAT CTT TTC CCG CAG ATC 1164
His His Asp Ile Gly Thr His Val Ile His His Leu Phe Pro Gln Ile
355 360 365

CCA CAT TAT CAT CTA GTA GAA GCA ACA GAA GCA GCT AAA CCA GTA CTA 1212
Pro His Tyr His Leu Val Glu Ala Thr Glu Ala Ala Lys Pro Val Leu
370 375 380

GGA AAG TAC TAC AGA GAA CCG AAA AAC TCT GGA CCT CTG CCA CTT CAC 1260
Gly Lys Tyr Tyr Arg Glu Pro Lys Asn Ser Gly Pro Leu Pro Leu His
385 390 395 400

TTA CTG GGA AGC CTC ATA AAG AGT ATG AAA CAA GAC CAT TTC GTA AGC 1308
Leu Leu Gly Ser Leu Ile Lys Ser Met Lys Gln Asp His Phe Val Ser
405 410 415

GAT ACA GGA GAT GTC GTG TAC TAT GAG GCA GAT CCA AAA CTC AAT GGA 1356
Asp Thr Gly Asp Val Val Tyr Tyr Glu Ala Asp Pro Lys Leu Asn Gly
420 425 430

CAA AGA ACA TGAGGACATA CTGCAGTGAA CCAGGCAGAC AAGTTACATA 1405
Gln Arg Thr
435 

AATTCATCTT GGCCCATTCA TTATGTTCTT TTTGTTTTGG TGTAAAGCCT TTTCGAGATT 1465 
AAAAAAGCAT TAATTTGTAG AAACCTGTGG TAAAACTCTC GATCAAATGA AATAAGATAT 1525 



(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 435 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Ala Ser Ser Val Leu Ser Glu Cys Gly Phe Arg Pro Leu Pro Arg
1 5 10 15

Phe Tyr Pro Lys His Thr Thr Ser Phe Ala Ser Asn Pro Lys Pro Thr
20 25 30

Phe Lys Phe Asn Pro Pro Leu Lys Pro Pro Ser Ser Leu Leu Asn Ser
35 40 45

Arg Tyr Gly Phe Tyr Ser Lys Thr Arg Asn Trp Ala Leu Asn Val Ala
50 55 60

Thr Pro Leu Thr Thr Leu Gln Ser Pro Ser Glu Glu Asp Thr Glu Arg
65 70 75 80

Phe Asp Pro Gly Ala Pro Pro Pro Phe Asn Leu Ala Asp Ile Arg Ala
85 90 95

Ala Ile Pro Lys His Cys Trp Val Lys Asn Pro Trp Met Ser Met Ser
100 105 110

Tyr Val Val Arg Asp Val Ala Ile Val Phe Gly Leu Ala Ala Val Ala
115 120 125

Ala Tyr Phe Asn Asn Trp Leu Leu Trp Pro Leu Tyr Trp Phe Ala Gln
130 135 140

Gly Thr Met Phe Trp Ala Leu Phe Val Leu Gly His Asp Cys Gly His
145 150 155 160

Gly Ser Phe Ser Asn Asp Pro Arg Leu Asn Ser Val Ala Gly His Leu
165 170 175

Leu His Ser Ser Ile Leu Val Pro Tyr His Gly Trp Arg Ile Ser His
180 185 190

Arg Thr His His Gln Asn His Gly His Val Glu Asn Asp Glu Ser Trp
195 200 205

His Pro Leu Pro Glu Ser Ile Tyr Lys Asn Leu Glu Lys Thr Thr Gln
210 215 220

Met Phe Arg Phe Thr Leu Pro Phe Pro Met Leu Ala Tyr Pro Phe Tyr
225 230 235 240

Leu Trp Asn Arg Ser Pro Gly Lys Gln Gly Ser His Tyr His Pro Asp
245 250 255

Ser Asp Leu Phe Leu Pro Lys Glu Lys Lys Asp Val Leu Thr Ser Thr
260 265 270

Ala Cys Trp Thr Ala Met Ala Ala Leu Leu Val Cys Leu Asn Phe Val
275 280 285

Met Gly Pro Ile Gln Met Leu Lys Leu Tyr Gly Ile Pro Tyr Trp Ile
290 295 300

Phe Val Met Trp Leu Asp Phe Val Thr Tyr Leu His His His Gly His
305 310 315 320

Glu Asp Lys Leu Pro Trp Tyr Arg Gly Lys Glu Trp Ser Tyr Leu Arg
325 330 335

Gly Gly Leu Thr Thr Leu Asp Arg Asp Tyr Gly Trp Ile Asn Asn Ile
340 345 350

His His Asp Ile Gly Thr His Val Ile His His Leu Phe Pro Gln Ile
355 360 365

Pro His Tyr His Leu Val Glu Ala Thr Glu Ala Ala Lys Pro Val Leu
370 375 380

Gly Lys Tyr Tyr Arg Glu Pro Lys Asn Ser Gly Pro Leu Pro Leu His
385 390 395 400

Leu Leu Gly Ser Leu Ile Lys Ser Met Lys Gln Asp His Phe Val Ser
405 410 415

Asp Thr Gly Asp Val Val Tyr Tyr Glu Ala Asp Pro Lys Leu Asn Gly
420 425 430

Gln Arg Thr
435 
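
As a consistency check on the listing, the CDS locations given for the three cDNA sequences can be compared with the stated lengths of the corresponding proteins. The sketch below is an illustration only: the function name is hypothetical, and it assumes that each CDS location is 1-based, inclusive and includes the stop codon, which is how the annotated sequences above appear to be laid out.

```python
def protein_length_from_cds(start, end, includes_stop=True):
    """Amino acid count implied by a 1-based, inclusive CDS location."""
    n_nt = end - start + 1
    if n_nt % 3 != 0:
        raise ValueError(f"CDS length {n_nt} is not a multiple of 3")
    codons = n_nt // 3
    return codons - 1 if includes_stop else codons


# (CDS start, CDS end, stated protein length) as given in the listing.
entries = {
    "SEQ ID NO: 1 / 2": (87, 1238, 383),
    "SEQ ID NO: 9 / 10": (125, 1465, 446),
    "SEQ ID NO: 11 / 12": (61, 1368, 435),
}
for name, (start, end, stated) in entries.items():
    print(name, protein_length_from_cds(start, end) == stated)
```

All three pairs agree under this assumption, which supports reading the CDS coordinates as including the terminal stop codon.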



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (synthetic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
GAYATHMGNG CNGCNATNCC

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (synthetic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
GCNATHCCNA ARCAYTG

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (synthetic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
AARCAYTGYT GGGTNAA

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (synthetic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
TGGTTYYTNT GGCCNYTNTA YTGG 24

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (synthetic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
TGGTTYYTNT GGCCN

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (synthetic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
TGGCCNYTNT AYTGG

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (synthetic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
TGGGTNGCNC ARGGNAC

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (synthetic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
TTYGTNYTNG GNCAYGA

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (synthetic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
GTNYTNGGNC AYGAYTG

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (synthetic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
GGNCAYGAYT GYGGNCA

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (synthetic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
TGYGGNCAYG GNWSNTT

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (synthetic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
CCNTAYCAYG GNTGG

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (synthetic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
CAYGGNTGGM GNATHWSNCA 20

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (synthetic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
TGGMGNATHT CNCAYMGNAC NCAYCA 26

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (synthetic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
TGGMGNATHA GYCAYMGNAC NCAYCA 26

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (synthetic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
TGGMGNATHW SNCAY 15 
(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (synthetic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
CAYMGNACNC AYCAY 

(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (synthetic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 
GARAAYGAYG ARWSNTGG 
(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (synthetic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
GAYGARWSNT GGGTNCC 
(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (synthetic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
NGTNACNGCR TCNARCCA
(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (synthetic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:
RTGRTGNARR TANGT

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (synthetic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:
ARNCCNCCNC KNARRTARCT CCA
(2) INFORMATION FOR SEQ ID NO:35: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (synthetic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:
ARNCCNCCNC KNARRTANGA CCA 
(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (synthetic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:
RTCNCKRTCD ATNGTNGTNA 20
(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (synthetic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:
RTARTCNCKR TCDATNGT 18 
(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (synthetic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 
RTGNGTNCCD ATRTCRTG 18
(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (synthetic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:
NARRTGRTGD ATNACRTG 18
(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (synthetic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

DATYTGNGGR AANARRTGRT G 21

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 20 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (synthetic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:
GGDATYTGNG GRAANARRTG 20
(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (synthetic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 
RTARTGNGGD ATYTGNGGRA ANA 23
(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

Asp Ile Arg Ala Ala Ile Pro
1 5

(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

Ala Ile Pro Lys His Cys
1 5 

(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

Lys His Cys Trp Val Lys 
1 5 

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

Trp Phe Leu Trp Pro Leu Tyr Trp
1 5 

(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

Trp Phe Leu Trp Pro 
1 5 

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 

Trp Pro Leu Tyr Trp 
1 5 

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 

Trp Val Ala Gln Gly Thr
1 5 

(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

Trp Val Ala Gln Gly Thr

1 5 

(2) INFORMATION FOR SEQ ID NO:51: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

Val Leu Gly His Asp Cys
1 5 

(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

Gly His Asp Cys Gly His
1 5

(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:

Cys Gly His Gly Ser Phe
1 5

(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

Pro Tyr His Gly Trp 
1 5 

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 

His Gly Trp Arg Ile Ser His
1 5 

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 9 amino acids
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 

Trp Arg Ile Ser His Arg Thr His His

1 5 

(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

Trp Arg Ile Ser His
1 5 

(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

His Arg Thr His His 
1 5 

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

Glu Asn Asp Glu Ser Trp 

1 5 

(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:

Asp Glu Ser Trp Val Pro
1 5 

(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 

Trp Leu Asp Ala Val Thr 
1 5

(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 

Thr Tyr Leu His His 
1 5 

(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 

Trp Ser Tyr Leu Arg Gly Gly Leu 
1 5 

(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

Leu Thr Thr Ile Asp Arg Asp
1 5 

(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

Thr Ile Asp Arg Asp Tyr
1 5

(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 

His Asp Ile Gly Thr His

1 5 

(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

His Val Ile His His Leu
1 5 

(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:68: 

His His Leu Phe Pro Gln Ile
1 5

(2) INFORMATION FOR SEQ ID NO: 69:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 

His Leu Phe Pro Gln Ile Pro
1 5 

(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 8 amino acids
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:

Leu Phe Pro Gln Ile Pro His Tyr
1 5 

(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1670 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 46..1302

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

CAAACTCTCT CGCGGGGTCG CTTCTTCTGC ATTTTCTGCT TCCCA ATG GCT TCC 54

Met Ala Ser

1 

AGA ATT GCT GAT TCT CTC TTC GCC TTC ACG GGC CCA CAG CAA TGT CTT 102 
Arg Ile Ala Asp Ser Leu Phe Ala Phe Thr Gly Pro Gln Gln Cys Leu
5 10 15 

CCT AGG GTT CCT AAG CTT GCT GCT TCT TCT GCT CGT GTT TCT CCT GCT 150
Pro Arg Val Pro Lys Leu Ala Ala Ser Ser Ala Arg Val Ser Pro Gly 
20 25 30 35 

GTA TAT GCT GTG AAG CCG ATT GAT CTT CTG TTA AAA GGA CGA ACT CAT 198
Val Tyr Ala Val Lys Pro Ile Asp Leu Leu Leu Lys Gly Arg Thr His
40 45 50 

CGA AGT AGA AGA TGT GTA GCT CCT GTG AAA AGG AGA ATT GGA TGT ATC 246 
Arg Ser Arg Arg Cys Val Ala Pro Val Lys Arg Arg Ile Gly Cys Ile
55 60 65 

AAA GCG GTG GCT GCT CCA GTT GCA CCG CCT TCA GCT GAC AGT GCA GAA 294 
Lys Ala Val Ala Ala Pro Val Ala Pro Pro Ser Ala Asp Ser Ala Glu 
70 75 80

GAC AGG GAA CAG TTA GCA GAA AGC TAT GGA TTC AGA CAA ATT GGA GAA 342 
Asp Arg Glu Gln Leu Ala Glu Ser Tyr Gly Phe Arg Gln Ile Gly Glu
85 90 95 

GAT CTT CCT GAG AAT GTC ACC TTA AAA GAT ATC ATG GAT ACA CTT CCC 390 
Asp Leu Pro Glu Asn Val Thr Leu Lys Asp Ile Met Asp Thr Leu Pro
100 105 110 115 

AAA GAG GTG TTT GAG ATT GAT GAT GTG AAA GCT TTG AAG TCT GTG TTG 438 
Lys Glu Val Phe Glu Ile Asp Asp Leu Lys Ala Leu Lys Ser Val Leu
120 125 130 

ATA TCT GTG ACT TCA TAC ACT TTG GGG CTC TTC ATG ATT GCA AAA TCG 486 
Ile Ser Val Thr Ser Tyr Thr Leu Gly Leu Phe Met Ile Ala Lys Ser
135 140 145 

CCG TGG TAT CTG CTA CCG TTG GCT TGG GCA TGG ACA GGA ACT GCA ATT 534 
Pro Trp Tyr Leu Leu Pro Leu Ala Trp Ala Trp Thr Gly Thr Ala Ile
150 155 160

ACC GGG TTC TTT CTG ATA GGT CAT GAT TGT GCA CAT AAG TCA TTT TCA 582
Thr Gly Phe Phe Val Ile Gly His Asp Cys Ala His Lys Ser Phe Ser
165 170 175

AAG AAC AAA TTG GTG GAA GAG ATT GTG GGT ACT CTC GCC TTC CTA CCA 630
Lys Asn Lys Leu Val Glu Asp Ile Val Gly Thr Leu Ala Phe Leu Pro
180 185 190 195



CTT GTC TAC CCA TAT GAG CCA TGG CGC TTT AAG CAC GAC CGC CAT CAC 678 
Leu Val Tyr Pro Tyr Glu Pro Trp Arg Phe Lys His Asp Arg His His 
200 205 210 

GCC AAA ACC AAC ATG TTA CTT CAT GAC ACA GCT TGG CAG CCA GTT CCG 726 
Ala Lys Thr Asn Met Leu Leu His Asp Thr Ala Trp Gln Pro Val Pro
215 220 225 

CCA GAG GAG TTT GAG TCA TCA CCC GTG ATG AGA AAG GCA ATC ATT TTT 774 
Pro Glu Glu Phe Glu Ser Ser Pro Val Met Arg Lys Ala Ile Ile Phe
230 235 240 

GGA TAT GGC CCA ATT AGA CCT TGG TTG TCC ATA GCT CAC TGG GTG AAC 822 
Gly Tyr Gly Pro Ile Arg Pro Trp Leu Ser Ile Ala His Trp Val Asn
245 250 255 

TGG CAC TTC AAT CTG AAA AAG TTC AGA GCG AGC GAG GTG AAT AGG GTG 870
Trp His Phe Asn Leu Lys Lys Phe Arg Ala Ser Glu Val Asn Arg Val
260 265 270 275

AAG ATA AGT TTG GCT TGT GTT TTC GCC TTC ATG GCC GTT GGG TGG CCA 918 
Lys Ile Ser Leu Ala Cys Val Phe Ala Phe Met Ala Val Gly Trp Pro
280 285 290 

CTG ATC GTA TAC AAA GTT GGT ATA TTG GGA TGG GTA AAA TTC TGG TTA 966 
Leu Ile Val Tyr Lys Val Gly Ile Leu Gly Trp Val Lys Phe Trp Leu
295 300 305 

ATG CCA TGG TTG GGC TAT CAC TTC TGG ATG AGC ACA TTC ACA ATG GTT 1014 
Met Pro Trp Leu Gly Tyr His Phe Trp Met Ser Thr Phe Thr Met Val 
310 315 320 

CAT CAT ACG GCT CCG CAT ATA CCT TTC AAG CCT GCG GAT GAG TGG AAC 1062 
His His Thr Ala Pro His Ile Pro Phe Lys Pro Ala Asp Glu Trp Asn
325 330 335 

GCG GCT CAG GCC CAG CTG AAT GGA ACT GTT CAT TGT GAC TAC CCT AGT 1110 
Ala Ala Gln Ala Gln Leu Asn Gly Thr Val His Cys Asp Tyr Pro Ser
340 345 350 355 

TGG ATT GAA ATT CTC TGC CAT GAT ATC AAC GTT CAC ATC CCG CAT CAT 1158 
Trp Ile Glu Ile Leu Cys His Asp Ile Asn Val His Ile Pro His His
360 365 370 

ATT AGC CCA AGA ATA CCG AGC TAC AAT CTC CGT GCA GCT CAT GAG TCT 1206 
Ile Ser Pro Arg Ile Pro Ser Tyr Asn Leu Arg Ala Ala His Glu Ser
375 380 385 

ATA CAA GAG AAC TGG GGA AAG TAT ACA AAC TTG GCT ACA TGG AAC TGG 1254
Ile Gln Glu Asn Trp Gly Lys Tyr Thr Asn Leu Ala Thr Trp Asn Trp
390 395 400 

CGA TTG ATG AAG ACG ATA ATG ACT GTG TGT GAT GTC TAT GAC AAA TAGGAGAACT 1309
Arg Leu Met Lys Thr Ile Met Thr Val Cys His Val Tyr Asp Lys
405 410 415 

ACATTCCTTT TGACCGGTTA GCCCCTGAAG AATCTCAGCC AATAACCTTC CTCAAGAAAT 1369

CAATGCCTAA CTACACAGCC TGATTCGCCA TGGTCTCAAA CTAGTCTTTT GAAATCTCAA 1429 

TATCTTTTTG CAGTCG00GA TGTTATATGT AAGCTTTCCA AGCGATGAGC TTCTCTAACA 1489

CTTCACCAAC GCTTTATACT GTTATCTTCT TTCCAATCTT ATCAGAAGAG AGAAACTGGT 1549

CAAATTATCT GAGCGATTGC AATTCTTTTA TCAGTTTCTT AGCTATAAGA AGATTGAACA 1609

GTCTATATAG TTTGCAATGT ACTGTAATGT GATGAAAATT TAGTTOATGA GAAAAAAAAA 1669

A 1670 
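
The feature table for SEQ ID NO: 71 gives the coding sequence as positions 46..1302 of a 1670 base pair cDNA, and the printed sequence shows a TAG triplet at positions 1300 to 1302. As a consistency check only, and not part of the original listing, the following Python sketch works through the arithmetic that links that location to the 418 amino acid length stated for SEQ ID NO: 72. The function name `cds_summary` is hypothetical, and the assumption that the stated CDS range includes the stop codon is labeled in the code.

```python
def cds_summary(start: int, end: int, total_length: int, includes_stop: bool = True):
    """Summarise a CDS given 1-based inclusive coordinates.

    Assumption: when includes_stop is True, the final codon in the range
    is the stop codon and is not counted as a residue.
    """
    cds_length = end - start + 1
    if cds_length % 3 != 0:
        raise ValueError("CDS length is not a whole number of codons")
    codons = cds_length // 3
    residues = codons - 1 if includes_stop else codons
    return {
        "five_prime_utr": start - 1,          # bases before the first codon
        "cds_length": cds_length,             # bases in the coding range
        "codons": codons,                     # triplets, stop included
        "residues": residues,                 # encoded amino acids
        "three_prime_utr": total_length - end,  # bases after the coding range
    }


if __name__ == "__main__":
    summary = cds_summary(start=46, end=1302, total_length=1670)
    for key, value in summary.items():
        print(f"{key}: {value}")
```

Under these assumptions the range holds 1257 bases, or 419 triplets, which corresponds to 418 residues plus a stop codon, with 45 bases of 5' leader and 368 bases of 3' trailer.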

(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 418 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 

Met Ala Ser Arg Ile Ala Asp Ser Leu Phe Ala Phe Thr Gly Pro Gln
1 5 10 15

Gln Cys Leu Pro Arg Val Pro Lys Leu Ala Ala Ser Ser Ala Arg Val
20 25 30 

Ser Pro Gly Val Tyr Ala Val Lys Pro Ile Asp Leu Leu Leu Lys Gly
35 40 45 

Arg Thr His Arg Ser Arg Arg Cys Val Ala Pro Val Lys Arg Arg Ile
50 55 60 

Gly Cys Ile Lys Ala Val Ala Ala Pro Val Ala Pro Pro Ser Ala Asp
65 70 75 80 

Ser Ala Glu Asp Arg Glu Gln Leu Ala Glu Ser Tyr Gly Phe Arg Gln
85 90 95 



WO 94/18337

PCT/US94/01321

Ile Gly Glu Asp Leu Pro Glu Asn Val Thr Leu Lys Asp Ile Met Asp
100 105 110 

Thr Leu Pro Lys Glu Val Phe Glu Ile Asp Asp Leu Lys Ala Leu Lys
115 120 125 

Ser Val Leu Ile Ser Val Thr Ser Tyr Thr Leu Gly Leu Phe Met Ile
130 135 140 

Ala Lys Ser Pro Trp Tyr Leu Leu Pro Leu Ala Trp Ala Trp Thr Gly 
145 150 155 160 

Thr Ala Ile Thr Gly Phe Phe Val Ile Gly His Asp Cys Ala His Lys
165 170 175 

Ser Phe Ser Lys Asn Lys Leu Val Glu Asp Ile Val Gly Thr Leu Ala
180 185 190

Phe Leu Pro Leu Val Tyr Pro Tyr Glu Pro Trp Arg Phe Lys His Asp 
195 200 205 

Arg His His Ala Lys Thr Asn Met Leu Leu His Asp Thr Ala Trp Gln
210 215 220

Pro Val Pro Pro Glu Glu Phe Glu Ser Ser Pro Val Met Arg Lys Ala
225 230 235 240 

Ile Ile Phe Gly Tyr Gly Pro Ile Arg Pro Trp Leu Ser Ile Ala His
245 250 255 

Trp Val Asn Trp His Phe Asn Leu Lys Lys Phe Arg Ala Ser Glu Val 
260 265 270 

Asn Arg Val Lys Ile Ser Leu Ala Cys Val Phe Ala Phe Met Ala Val
275 280 285

Gly Trp Pro Leu Ile Val Tyr Lys Val Gly Ile Leu Gly Trp Val Lys
290 295 300 

Phe Trp Leu Met Pro Trp Leu Gly Tyr His Phe Trp Met Ser Thr Phe 
305 310 315 320 

Thr Met Val His His Thr Ala Pro His Ile Pro Phe Lys Pro Ala Asp
325 330 335 

Glu Trp Asn Ala Ala Gln Ala Gln Leu Asn Gly Thr Val His Cys Asp
340 345 350 

Tyr Pro Ser Trp Ile Glu Ile Leu Cys His Asp Ile Asn Val His Ile
355 360 365 

Pro His His Ile Ser Pro Arg Ile Pro Ser Tyr Asn Leu Arg Ala Ala
370 375 380 

His Glu Ser Ile Gln Glu Asn Trp Gly Lys Tyr Thr Asn Leu Ala Thr
385 390 395 400

Trp Asn Trp Arg Leu Met Lys Thr Ile Met Thr Val Cys His Val Tyr
405 410 415 

Asp Lys 






Claims: 

1. A genetically transformed plant which has an elevated
linolenic acid content comprising a recombinant, double-stranded DNA
molecule comprising

(i) a promoter that functions in plant cells to cause
the production of an RNA sequence, said promoter
operably linked to;

(ii) a structural coding sequence that causes the
production of an RNA sequence that encodes a linoleic
acid desaturase activity; and

(iii) a 3' non-translated region that functions in plant 
cells to promote polyadenylation to the 3' end of said RNA 
sequence. 

2. The plant of claim 1 in which the linoleic acid desaturase
activity is from plants.

3. The plant of claim 1 in which the linoleic acid desaturase
activity is from fungi, algae or bacteria.

4. The plant of claim 1 in which the structural coding
sequence of (ii) is taken from SEQ. ID NO:1.

5. The plant of claim 1 in which the structural coding 
sequence of (ii) is taken from SEQ. ID NO:9. 

6. The plant of claim 1 in which the structural coding 
sequence of (ii) is taken from SEQ. ID NO:11.

7. The plant of claim 1 in which the promoter of (i) is an
endogenous plant linoleic acid desaturase promoter.

8. A genetically transformed plant which has a reduced
linolenic acid content, comprising a recombinant, double-stranded DNA
molecule comprising

(i) a promoter that functions in plant cells to cause
the production of an RNA sequence, said promoter 
operably linked to; 

(ii) a DNA sequence that causes the production of an
RNA sequence that is in antisense orientation to at least
a portion of a gene that encodes a linoleic acid desaturase
activity in said plant; and

(iii) a 3' non-translated region that functions in plant
cells to promote polyadenylation to the 3' end of said RNA
sequence.

9. The plant of claim 8 in which the linoleic acid desaturase 
enzyme is from plants. 

10. The plant of claim 8 in which the linoleic acid desaturase
enzyme is from fungi, algae or bacteria.

11. The plant of claim 8 in which the structural coding
sequence of (ii) is taken from SEQ. ID NO:1.

12. The plant of claim 8 in which the structural coding 
sequence of (ii) is taken from SEQ. ID NO:9. 

13. The plant of claim 8 in which the structural coding
sequence of (ii) is taken from SEQ. ID NO:11.

14. The plant of claim 8 in which the promoter of (i) is an
endogenous plant linoleic acid desaturase promoter.

15. A genetically transformed plant which has an improved
resistance to low temperatures comprising a recombinant, double-stranded
DNA molecule comprising

(i) a promoter that functions in plant cells to cause
the production of an RNA sequence, said promoter
operably linked to;

(ii) a structural coding sequence that causes the
production of an RNA sequence that encodes a linoleic
acid desaturase activity; and

(iii) a 3' non-translated region that functions in plant
cells to promote polyadenylation to the 3' end of said RNA
sequence.

16. A genetically transformed plant which has an elevated 
ability to respond to pathogens, comprising a recombinant, double-stranded 
DNA molecule comprising 

(i) a promoter that functions in plant cells to cause
the production of an RNA sequence, said promoter
operably linked to;

(ii) a structural coding sequence that causes the
production of an RNA sequence that encodes a linoleic
acid desaturase activity; and

(iii) a 3' non-translated region that functions in plant
cells to promote polyadenylation to the 3' end of said RNA 
sequence. 

17. A seed produced from a genetically transformed plant where
said seed has a linolenic acid content suitable for use as a source of
linolenic acid, said plant comprising a recombinant, double-stranded DNA
molecule comprising

(i) a promoter that functions in plant cells to cause
the production of an RNA sequence, said promoter
operably linked to;

(ii) a structural coding sequence that causes the
production of an RNA sequence that encodes a linoleic
acid desaturase activity; and

(iii) a 3' non-translated region that functions in plant
cells to promote polyadenylation to the 3' end of said RNA 
sequence. 

18. The seed of claim 17 where said plant is selected from the
group consisting of soybean and rapeseed.

19. A genetically transformed plant which has a linolenic acid
content of less than about 3%, said plant comprising a recombinant,
double-stranded DNA molecule comprising

(i) a promoter that functions in plant cells to cause
the production of an RNA sequence, said promoter
operably linked to;

(ii) a DNA sequence that causes the production of an 
RNA sequence that is in antisense orientation to at least 
a portion of a gene that encodes a linoleic acid desaturase
activity in said plant; and

(iii) a 3' non-translated region that functions in plant
cells to promote polyadenylation to the 3' end of said RNA 
sequence. 

20. A genetically transformed plant which has an increased 
oleic acid content, comprising a recombinant, double-stranded DNA
molecule comprising

(i) a promoter that functions in plant cells to cause 
the production of an RNA sequence, said promoter 
operably linked to; 

(ii) a DNA sequence that causes the production of an
RNA sequence that is in antisense orientation to at least
a portion of a gene that encodes an oleic acid desaturase
activity in said plant; and

(iii) a 3' non-translated region that functions in plant
cells to promote polyadenylation to the 3' end of said RNA
sequence. 

21. A genetically transformed plant which has an increased
oleic acid content, comprising a recombinant, double-stranded DNA
molecule comprising

(i) a promoter that functions in plant cells to cause
the production of an RNA sequence, said promoter
operably linked to;

(ii) a DNA sequence that causes the production of an
RNA sequence that is in antisense orientation to at least
a portion of a gene that encodes a linoleic acid desaturase
activity in said plant; and

(iii) a 3' non-translated region that functions in plant
cells to promote polyadenylation to the 3' end of said RNA
sequence.

22. A method of producing a genetically transformed plant 
which has an elevated linolenic acid content, comprising 

(a) inserting into the genome of a plant cell a
recombinant, double-stranded DNA molecule comprising:

(i) a promoter that functions in plant cells to
cause the production of an RNA sequence, said
promoter operably linked to;

(ii) a structural coding sequence that causes
the production of an RNA sequence that encodes
a linoleic acid desaturase activity; and

(iii) a 3' non-translated region that functions in
plant cells to promote polyadenylation to the 3'
end of said RNA sequence;

(b) obtaining transformed plant cells; and

(c) regenerating from the transformed plant cells 
genetically transformed plants which have an elevated 
linolenic acid content. 

23. The method of claim 22 in which the linoleic acid
desaturase enzyme is from plants.

24. The method of claim 22 in which the linoleic acid
desaturase enzyme is from fungi, algae or bacteria.

25. The method of claim 22 in which the structural coding
sequence of (ii) is taken from SEQ. ID NO:1.

26. The method of claim 22 in which the structural coding 
sequence of (ii) is taken from SEQ. ID NO:9. 

27. The method of claim 22 in which the structural coding 
sequence of (ii) is taken from SEQ. ID NO:11.

28. The plant of claim 22 in which the promoter of (i) is an
endogenous plant linoleic acid desaturase promoter.

29. A method of producing a genetically transformed plant
which has a reduced linolenic acid content, comprising

(a) inserting into the genome of a plant cell a
recombinant, double-stranded DNA molecule comprising:

(i) a promoter that functions in plant cells to 
cause the production of an RNA sequence, said 
promoter operably linked to; 

(ii) a DNA sequence that causes the
production of an RNA sequence that is in
antisense orientation to at least a portion of a
gene that encodes a linoleic acid desaturase
activity in said plant; and

(iii) a 3' non-translated region that functions in
plant cells to promote polyadenylation to the 3'
end of said RNA sequence;

(b) obtaining transformed plant cells; and

(c) regenerating from the transformed plant cells
genetically transformed plants which have a reduced
linolenic acid content.

30. The method of claim 29 in which the linoleic acid
desaturase enzyme is from plants.

31. The method of claim 29 in which the linoleic acid
desaturase enzyme is from fungi, algae or bacteria.

32. The method of claim 29 in which the structural coding 
sequence of (ii) is taken from SEQ. ID NO:1.

33. The method of claim 29 in which the structural coding
sequence of (ii) is taken from SEQ. ID NO:9.

34. The method of claim 29 in which the structural coding
sequence of (ii) is taken from SEQ. ID NO:11.

35. The plant of claim 29 in which the promoter of (i) is an 
endogenous plant linoleic acid desaturase promoter. 

36. A method of producing a genetically transformed plant
which has an increased oleic acid content, comprising

(a) inserting into the genome of a plant cell a
recombinant, double-stranded DNA molecule comprising:

(i) a promoter that functions in plant cells to
cause the production of an RNA sequence, said
promoter operably linked to;

(ii) a DNA sequence that causes the
production of an RNA sequence that is in
antisense orientation to at least a portion of a
gene that encodes a linoleic acid desaturase
activity in said plant; and

(iii) a 3' non-translated region that functions in
plant cells to promote polyadenylation to the 3'
end of said RNA sequence;

(b) obtaining transformed plant cells; and

(c) regenerating from the transformed plant cells
genetically transformed plants which have an increased
oleic acid content.

37. A recombinant, double-stranded DNA molecule
comprising in sequence:

(i) a promoter that functions in plant cells to cause
the production of an RNA sequence, said promoter
operably linked to;

(ii) a structural coding sequence that causes the
production of an RNA sequence that encodes a linoleic
acid desaturase activity; and

(iii) a 3' non-translated region that functions in plant
cells to promote polyadenylation to the 3' end of said RNA
sequence.

38. A recombinant, double-stranded DNA molecule 
comprising in sequence: 

(i) a promoter that functions in plant cells to cause
the production of an RNA sequence, said promoter
operably linked to;

(ii) a DNA sequence that causes the production of an
RNA sequence that is in antisense orientation to at least
a portion of a gene that encodes a linoleic acid desaturase
activity in said plant; and

(iii) a 3' non-translated region that functions in plant
cells to promote polyadenylation to the 3' end of said RNA 
sequence. 

39. A plant cell comprising a recombinant, double-stranded
DNA molecule comprising in sequence:

(i) a promoter that functions in plant cells to cause
the production of an RNA sequence, said promoter
operably linked to;

(ii) a DNA sequence that causes the production of an
RNA sequence that is in antisense orientation to at least
a portion of a gene that encodes a linoleic acid desaturase
activity in said plant; and

(iii) a 3' non-translated region that functions in plant
cells to promote polyadenylation to the 3' end of said RNA
sequence.

40. A method of producing a genetically transformed plant 
which has an increased oleic acid content, comprising 

(a) inserting into the genome of a plant cell a
recombinant, double-stranded DNA molecule comprising:

(i) a promoter that functions in plant cells to
cause the production of an RNA sequence, said
promoter operably linked to;

(ii) a DNA sequence that causes the
production of an RNA sequence that is in
antisense orientation to at least a portion of a
gene that encodes an oleic acid desaturase activity
in said plant; and

(iii) a 3' non-translated region that functions in
plant cells to promote polyadenylation to the 3'
end of said RNA sequence;

(b) obtaining transformed plant cells; and

(c) regenerating from the transformed plant cells
genetically transformed plants which have an increased
oleic acid content.

AATCCATCAA ACCTTTATTC ACXXATTTC ACTGAAAGGC CACACATCTA GAGAGAGAAA 60

CTTCGTCCAA ATCTCTCTCT CCAGCG ATG GTT GTT GCT ATG GAG GAG CGC AGC 113

Met Val Val Ala Met Asp Gln Arg Ser
1 5 

AAT GTT AAC GGA GAT TCC GGT GCC CGG AAG GAA GAA GGG TTT GAT CCA 161 
Asn Vol Asn Gly Asp Ser Gly Alo Arg Lys Glu Glu Gly Phe Asp Pro 
10 15 20 25 

AGC GGA CAA CCA CCG TTT AAG ATC GGA GAT ATA AGG GCG GCG ATT CCT 209 
Ser Alo Gin Pro Pro Phe Lys lie Gly Asp He Arg Alo Alo He Pro 
30 35 40 

AAG CAT TGC TGG GTG AAG AGT CCT TTG AGA TCT ATG AGC TAC GTC ACC 257 
Lvs His Cys Trp Vol Lys Ser Pro Leu Arg Ser Met Ser Tyr Vol Thr 
45 50 55 

AGA GAG ATT TTC GCC GTC GCG GGT CTG GCC ATG GCC GCC GTG TAT TTT 305 
Arg Asp He Phe Alo Vol Alo Alo Leu Alo Met Alo Alo Vol Tyr Phe 
60 65 70 

GAT AGC TGG TTC CTC TGG CCA GTC TAC TGG GTT GCC CAA GGA ACC CTT 353 
Asp Ser Trp Phe Leu Trp Pro Leu Tyr Trp Vol Alo Gin Gly Thr Leu 
75 80 85 

TTC TGG GCC ATC TTC GTT CTT GGC CAC GAG TGT GGA CAT GGG AGT TTC 401 
Phe Trp Alo He Phe Vol Leu Gly His Asp Cys Gly His Gly Ser Phe 
90 95 100 105 

TCA GAC ATT CCT CTG CTG AAC AGT GTG GTT GGT CAC ATT CTT CAT TCA 449 
Ser Asp He Pro Leu Leu Asn Ser Vol Vol Gly His He Leu His Ser 
110 115 120 

TTC ATC CTC GTT CCT TAC CAT GGT TGG AGA ATA AGC CAT CGG ACA CAC 497 
Phe He Leu Vol Pro Tyr His Gly Trp Arg He Ser His Arg Thr His 
125 130 135 

CAC GAG AAC CAT GGC CAT GTT GAA AAC GAC GAG TCT TGG GTT CCG TTG 545 
His Gin Asn His Gly His Vol Glu Asn Asp Glu Ser Trp Vol Pro Leu 
140 145 150 
CCA GM AAG TTG TAG AAG AAC TTG CCC CAT AGT ACT CGG ATG CTC AGA 593
Pro Glu Lys Leu Tyr Lys Asn Leu Pro His Ser Thr Arg Mel Leu Arg 
155 160 165 

TAG ACT GTC CCT CTG CCC ATG CTC GCT TAC COG ATC TAT CTG TGG TAC 641 
Tyr Thr Vol Pro Leu Pro Mel Leu Alo Tyr Pro He Tyr Leu Trp Tyr 
170 175 180 185 

AGA AGT CCT GGA AAA GAA GGG TCA CAT TTT AAC CCA TAC AGT AGT TTA 689 
Arg Ser Pro Gly Lys Glu Gly Ser His Phe Asn Pro Tyr Ser Ser Leu 
190 195 200 

TTT GCT CCA AGC GAG AGG AAG CTT ATT GCA ACT TCA ACT ACT TGC TGG 737 
Phe Ala Pro Ser Glu Arg Lys Leu lie Alo Thr Ser Thr Thr Cys Trp 
205 210 215 

TCC ATA ATG TTG GCC ACT CTT GH TAT CTA TCG TTC CTC GTT GAT CCA 785 
Ser lie Mel Leu Alo Thr Leu Vol Tyr Leu Ser Phe Leu Vol Asp Pro 
220 225 230 

GTC ACA GTT CTC AAA GTC TAT GGC GTT CCT TAC ATT ATC TTT GTG ATG 833 
Vol Thr Vol Leu Lys Vol Tyr Gly Vol Pro Tyr lie lie Phe Vol Met 
235 240 245 

TGG TTG GAG GCT GTC ACG TAC TTG CAT CAT CAT GGT CAC GAT GAG AAG 881 
Trp Leu Asp Alo Vol Thr Tyr Leu His His His Gly His Asp Glu Lys 
250 255 260 265 

TTG CCT TGG TAC AGA GGC AAG GAA TGG AGT TAT TTA CGT GGA GGA TTA 929 
Leu Pro Trp Tyr Arg Gly Lys Glu Trp Ser Tyr Leu Arg Gly Gly Leu 
270 275 280 

ACA ACT ATT GAT AGA GAT TAC GGA ATC TTC AAC AAC ATC CAT CAC GAC 977 
Thr Thr lie Asp Arg Asp Tyr Gly He Phe Asn Asn lie His His Asp 
285 290 295 

ATT GGA ACT CAC GTG ATC CAT CAT CTT TTC CCA CAA ATC CCT CAC TAT 1025 
He Gly Thr His Vol He His His Leu Phe Pro Gin He Pro His Tyr 
300 305 310 
CAC TTG GTC GAT GCC ACG AGA OCA GCT AAA CAT GTG TTA GGA AGA TAC 1073
His Leu Vol Asp Alo Thr Arg Ala Ala Lys His Vol Leu Gly Arg Tyr 
315 320 325 

TAC AGA GAG CCG AAG ACG TCA GGA GCA ATA CCG ATT CAC TTG GTG GAG 1121 
Tyr Arg Glu Pro Lys Thr Ser Gly Ala lie Pro He His Leu Vol Glu 
330 335 340 345 

ACT TTG GTC GCA AGT ATT AAA AAA GAT CAT TAC GTC AGT GAC ACT GGT 1169 
Ser Leu Vol Alo Ser lie Lys Lys Asp His Tyr Vol Ser Asp Thr Gly 
350 355 360 

GAT ATT GTC TTC TAC GAG ACA GAT CCA GAT CTC TAC GTT TAT GCT TCT 1217 
Asp He Vol Phe Tyr Glu Thr Asp Pro Asp Leu Tyr Vol Tyr Alo Ser 
365 370 375 

GAC AAA TCT AAA ATC AAT TAACTTTTCT TCCTAGCTCT ATTAGGAATA 1265 
Asp Lys Ser Lys He Asn 
380 

AACACTGCTT CTCTTTTACT TATTTGTTTC TGCTTTAAGT TTAAAATGTA CTCGTGAAAC 1325 
CTTTTTTTTA TTAATGTATT TACGTTAC 1353 
FIG.3d



Met Val Val Ala Met Asp Gln Arg Ser Asn Val Asn Gly Asp Ser Gly
1 5 10 15

Ala Arg Lys Glu Glu Gly Phe Asp Pro Ser Ala Gln Pro Pro Phe Lys
20 25 30

Ile Gly Asp Ile Arg Ala Ala Ile Pro Lys His Cys Trp Val Lys Ser
35 40 45

Pro Leu Arg Ser Met Ser Tyr Val Thr Arg Asp Ile Phe Ala Val Ala
50 55 60

Ala Leu Ala Met Ala Ala Val Tyr Phe Asp Ser Trp Phe Leu Trp Pro
65 70 75 80

Leu Tyr Trp Val Ala Gln Gly Thr Leu Phe Trp Ala Ile Phe Val Leu
85 90 95

Gly His Asp Cys Gly His Gly Ser Phe Ser Asp Ile Pro Leu Leu Asn
100 105 110

Ser Val Val Gly His Ile Leu His Ser Phe Ile Leu Val Pro Tyr His
115 120 125

Gly Trp Arg Ile Ser His Arg Thr His His Gln Asn His Gly His Val
130 135 140

Glu Asn Asp Glu Ser Trp Val Pro Leu Pro Glu Lys Leu Tyr Lys Asn
145 150 155 160

Leu Pro His Ser Thr Arg Met Leu Arg Tyr Thr Val Pro Leu Pro Met
165 170 175

Leu Ala Tyr Pro Ile Tyr Leu Trp Tyr Arg Ser Pro Gly Lys Glu Gly
180 185 190

Ser His Phe Asn Pro Tyr Ser Ser Leu Phe Ala Pro Ser Glu Arg Lys
195 200 205

Leu Ile Ala Thr Ser Thr Thr Cys Trp Ser Ile Met Leu Ala Thr Leu
210 215 220

Val Tyr Leu Ser Phe Leu Val Asp Pro Val Thr Val Leu Lys Val Tyr
225 230 235 240

Gly Val Pro Tyr Ile Ile Phe Val Met Trp Leu Asp Ala Val Thr Tyr
245 250 255

Leu His His His Gly His Asp Glu Lys Leu Pro Trp Tyr Arg Gly Lys
260 265 270

Glu Trp Ser Tyr Leu Arg Gly Gly Leu Thr Thr Ile Asp Arg Asp Tyr
275 280 285

Gly Ile Phe Asn Asn Ile His His Asp Ile Gly Thr His Val Ile His
290 295 300

His Leu Phe Pro Gln Ile Pro His Tyr His Leu Val Asp Ala Thr Arg
305 310 315 320

Ala Ala Lys His Val Leu Gly Arg Tyr Tyr Arg Glu Pro Lys Thr Ser
325 330 335

Gly Ala Ile Pro Ile His Leu Val Glu Ser Leu Val Ala Ser Ile Lys
340 345 350

Lys Asp His Tyr Val Ser Asp Thr Gly Asp Ile Val Phe Tyr Glu Thr
355 360 365

Asp Pro Asp Leu Tyr Val Tyr Ala Ser Asp Lys Ser Lys Ile Asn
370 375 380
10 20 30 40 50 60

BND3.AMI RSNVNGDSGARKEEGFOPSAQPPFKIGDIRAAIPKHCWVKSPLRSMSYVTRDIFAVAALA 



DESA AMI MTATIPPLTPTVTPSNPDRPIADLKLQOIIKTLPKECFEKKASKAWASVLITLGAIAVGY 

10 20 30 40 50 60 

70 80 90 100 110 120 

BND3.AMI MAAVYFDSWFLVIIPLYWVAQGTLFWAIFVLGHDCGHGSFSDIPLLNSVVGHILHSFILVPY 

DESA AMI LGI iYL-PWYCLPiTwimGTALTGAFWGHOaHRS^ 

70 80 90 100 110 

130 140 150 160 170 180 

BND3.AMI HGWRISHRTHHQNHGHVENDESWVPLPEKLYKNLPHSTRMLRYTVPLPH-LAYPIYLWYR 

■ " " «.i.r 

DESA AMI HSWRLLI^DHHHLHTNKiEVDNAW^^ 

120 130 140 150 160 170 

190 200 210 220 230 240 

BND3.AMI SPGKEGSHFNPYSSLFAPSERKLIATSTTCWSIMLATLVYLSFLVDP-V-TVLKVYGVPY 

... . . • I . V; . . I . 

OESA AMI kMHFK— LSNFAQTORNKVKLsiAV-VFLFM i I TTGVVIGFVKFWLMPW 

180 190 200 210 220 230 

250 260 270 280 290 300 

BNDS.AMI I IFVMIW.DAVTYLmHGHDEKLPWYRGKEWSYLRGGL-TTIDRDYGIFNNIH-HDIGTHV 

DESA AMI LVYHFWMSTFTI^rt^HTIPEIRF-^?P^^»^ 

240 250 260 270 280 

310 320 330 340 350 360 

BND3 .AMI IHHLFPQIPHYHLVDATRAAKHVLGRYYREPKTSGAIPIHLVESLVASI KKDHYVSDTGD 

DESA AMI PliisVAiPSmRLAHGkKENV^PFLYE^^ 

290 300 310 320 330 340 

BND3.AMI IVF 

DESA. AMI KKV 
350 
GGAAAACACA AGTTTCTCTC ACACACATTA TCTCTTTCTC TATTACCACC ACTCATTCAT 60

AACAGAAACC CACXJAAAAAA TAAAAAGAGA GACTTTTCAC TCTGGGGAGA GAGCTCAAGT 120 

TCTA ATG GOG AAC TTG GTC TTA TCA GAA TGT GGT ATA CGA COT CTC CCC 169 
Met Alo Asn Leu Vol Leu Ser Glu Cys Gly He Arg Pro Leu Pro 
1 5 10 15 
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ATC TAG 


ACA 
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CCC 
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AAT 


TTC 


CTC 
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30 
TGG CCT CTC TAT TGG CTC GCT CAA GGA ACC ATG TTT TGG GCT CTC TTT 601
Trp Pro Leu Tyr Trp Leu Alo Gin Gly Thr Met Phe Trp Aio Leu Phe 
145 150 155 

GTT CTT GGT CAT GAC TGT GGA CAT GGT ACT TTC TCA AAT GAT CCG AAG 649 
Vol Leu Gly His Asp Cys Gly His Gly Ser Phe Ser Asn Asp Pro Lys 
160 165 170 175 

TTG AAC ACT GTG GTC GGT CAT CTT CTT CAT TCC TCA ATT CTG GTC CCA 697 
Leu Asn Ser Vol Vol Gly His Leu Leu His Ser Ser He Leu Vol Pro 
180 185 190 

TAC CAT GGC TGG AGA ATT AGT CAC AGA ACT CAC CAC CAG AAC CAT GGA 745 
Tyr His Gly Trp Arg He Ser His Arg Thr His His Gin Asn His Gly 
195 200 205 

CAT GTT GAG AAT GAC GAA TCT TGG CAT CCT ATG TCT GAG AAA ATC TAC 793 
His Vol Glu Asn Asp Glu Ser Trp His Pro Mel Ser Glu Lys He Tyr 
210 215 220 

AAT ACT TTG GAC AAG CCG ACT AGA TTC TTT AGA TTT ACA CTG CCT CTC 841 
Asn Thr Leu Asp Lys Pro Thr Arg Phe Phe Arg Phe Thr Leu Pro Leu 
225 230 235 

GTG ATG CTT GCA TAC CCT TTC TAC TTG TGG GCT CGA AGT CCG GGG AAA 889 
Vol Mel Leu Alo Tyr Pro Phe Tyr Leu Trp Alo Arg Ser Pro Giy Lys 
240 245 250 255 

AAG GGT TCT CAT TAC CAT CCA GAC AGT GAC TTG TTC CTC CCT AAA GAG 937 
Lys Gly Ser His Tyr His Pro Asp Ser Asp Leu Phe Leu Pro Lys Glu 
260 265 270 

AGA AAG GAT GTC CTC ACT TCT ACT GCT TGT TGG ACT GCA ATG GCT GCT 985 
Arg Lys Asp Vol Leu Thr Ser Thr Alo Cys Trp Thr Alo Met Alo Alo 
275 280 285 

CTG CTT GTT TGT CTC AAC TTC ACA ATC GGT CCA ATT CAA ATG CTC AAA 1033 
Leu Leu Vol Cys Leu Asn Phe Thr He Gly Pro He Gin Mel Leu Lys 
290 295 300 
CTT TAT GGA ATT CCT TAC TGG ATA AAT GTA ATG TGG TTG GAC TTT GIG 1081
Leu Tyr Gly lie Pro Tyr Trp He Asn Vol Mel Trp Leu Asp Phe Vol 
305 310 315 

ACT TAC CTG CAT CAC CAT GGT CAT GAA GAT AAG CTT CCT TGG TAC CGT 1129. 
Thr Tyr Leu His His His Gly His Glu Asp Lys Leu Pro Trp Tyr Arg 
320 325 330 335 

GGC AAG GAG TGG AGT TAC CTG AGA GGA GGA CTT ACA ACA TTG GAT CGT 1 177 
Gly Lys Glu Trp Scr Tyr Leu Arg Gly Gly Leu Thr Thr Leu Asp Arg 
340 345 350 

GAC TAC GGA TTG ATC AAT AAC ATC CAT CAT GAT ATT GGA ACT CAT GTG 1225 
Asp Tyr Gly Leu Me Asn Asn He His His Asp He Gly Thr His Vol 
355 360 365 

ATA CAT CAT CTT TTC COG CAG ATC CCA CAT TAT CAT CTA GTA GAA GCA 1273 
He His His Leu Phe Pro Gin He Pro His Tyr His Leu Vol Glu Alo 
370 375 380 

ACA GAA GCA GCT AAA CCA GTA TTA GGG AAG TAT TAC AGG GAG CCT GAT 1321 
Thr Glu Alo Alo Lys Pro Vol Leu Gly Lys Tyr Tyr Arg Glu Pro Asp 
385 390 395 

AAG TCT GGA CCG TTG CCA TTA CAT TTA CTG GAA ATT CTA GGG AAA AGT 1369 
Lys Ser Gly Pro Leu Pro Leu His Leu Leu Glu He Leu Alo Lys Ser 
400 405 410 415 

ATA AAA GAA GAT CAT TAC GTG AGC GAC GAA GGA GAA GTT GTA TAC TAT 1417 
He Lys Glu Asp His Tyr Vol Ser Asp Glu Gly Glu Vol Vol Tyr Tyr 
420 425 430 

AAA GCA GAT CCA AAT CTC TAT GGA GAG GTC AAA GTA AGA GCA GAT TGAAATGAAG 1472 
Lys Alo Asp Pro Asn Leu Tyr Gly Glu Vol Lys Vol Arg Alo Asp 
435 440 445 

CAGGCTTGAG ATTGAAGTTT TTTCTATTTC AGACCAGCTG ATTTTTTGCT TACTGTATCA 1532 

ATTTATTGTG TCACCCACCA CAGAGTTAGT ATCTCTGAAT ACGATCGATC AGATGGAAAC 1592 

AACAAATTTG TTTGCGATAC TGAAGCTATA TATACCATAA AAAAAAAAAA AAA 1645 
Met Ala Asn Leu Val Leu Ser Glu Cys Gly Ile Arg Pro Leu Pro Arg
1 5 10 15

Ile Tyr Thr Thr Pro Arg Ser Asn Phe Leu Ser Asn Asn Asn Lys Phe
20 25 30

Arg Pro Ser Leu Ser Ser Ser Ser Tyr Lys Thr Ser Ser Ser Pro Leu
35 40 45

Ser Phe Gly Leu Asn Ser Arg Asp Gly Phe Thr Arg Asn Trp Ala Leu
50 55 60

Asn Val Ser Thr Pro Leu Thr Thr Pro Ile Phe Glu Glu Ser Pro Leu
65 70 75 80

Glu Glu Asp Asn Lys Gln Arg Phe Asp Pro Gly Ala Pro Pro Pro Phe
85 90 95

Asn Leu Ala Asp Ile Arg Ala Ala Ile Pro Lys His Cys Trp Val Lys
100 105 110

Asn Pro Trp Lys Ser Leu Ser Tyr Val Val Arg Asp Val Ala Ile Val
115 120 125

Phe Ala Leu Ala Ala Gly Ala Ala Tyr Leu Asn Asn Trp Ile Val Trp
130 135 140

Pro Leu Tyr Trp Leu Ala Gln Gly Thr Met Phe Trp Ala Leu Phe Val
145 150 155 160

Leu Gly His Asp Cys Gly His Gly Ser Phe Ser Asn Asp Pro Lys Leu
165 170 175

Asn Ser Val Val Gly His Leu Leu His Ser Ser Ile Leu Val Pro Tyr
180 185 190

His Gly Trp Arg Ile Ser His Arg Thr His His Gln Asn His Gly His
195 200 205

Val Glu Asn Asp Glu Ser Trp His Pro Met Ser Glu Lys Ile Tyr Asn
210 215 220

Thr Leu Asp Lys Pro Thr Arg Phe Phe Arg Phe Thr Leu Pro Leu Val
225 230 235 240

Met Leu Ala Tyr Pro Phe Tyr Leu Trp Ala Arg Ser Pro Gly Lys Lys
245 250 255

Gly Ser His Tyr His Pro Asp Ser Asp Leu Phe Leu Pro Lys Glu Arg
260 265 270

Lys Asp Val Leu Thr Ser Thr Ala Cys Trp Thr Ala Met Ala Ala Leu
275 280 285

Leu Val Cys Leu Asn Phe Thr Ile Gly Pro Ile Gln Met Leu Lys Leu
290 295 300

Tyr Gly Ile Pro Tyr Trp Ile Asn Val Met Trp Leu Asp Phe Val Thr
305 310 315 320

Tyr Leu His His His Gly His Glu Asp Lys Leu Pro Trp Tyr Arg Gly
325 330 335

Lys Glu Trp Ser Tyr Leu Arg Gly Gly Leu Thr Thr Leu Asp Arg Asp
340 345 350

Tyr Gly Leu Ile Asn Asn Ile His His Asp Ile Gly Thr His Val Ile
355 360 365

His His Leu Phe Pro Gln Ile Pro His Tyr His Leu Val Glu Ala Thr
370 375 380

Glu Ala Ala Lys Pro Val Leu Gly Lys Tyr Tyr Arg Glu Pro Asp Lys
385 390 395 400

Ser Gly Pro Leu Pro Leu His Leu Leu Glu Ile Leu Ala Lys Ser Ile
405 410 415

Lys Glu Asp His Tyr Val Ser Asp Glu Gly Glu Val Val Tyr Tyr Lys
420 425 430

Ala Asp Pro Asn Leu Tyr Gly Glu Val Lys Val Arg Ala Asp
435 440 445
AGAGAGTGCA AATAGAAOGA CAGAGACTTT TTCCTCTTTT CTTCTTGGGA AGAGGCTCCA 60

ATG GCG AGC TCG GTT TTA TCA GAA TGT GGT TTT AGA CCT CTC CCC AGA 108 
Met Alo Ser Ser Vol Leu Scr Glu Cys Gly Phe Arg Pro Leu Pro Arg 
15 10 15 

TTC TAC CCT AAA CAC ACA ACC TCT TTT GCG TCT AAC CCT AAA CCC ACT 156 
Phe Tyr Pro Lys His Thr Thr Ser Phe Alo Ser Asn Pro Lys Pro Thr 
20 25 30 

TTC AAA TTC AAT CCA CCA CTT AAA CCT CCT TCT TCT CTT CTC AAT TCC 204 
Phe Lys Phe Asn Pro Pro Leu Lys Pro Pro Ser Ser Leu Leu Asn Ser 
35 40 45 

CGA TAT GGA TTC TAC TCT AAA ACC AGO AAC TGG GCA TTG AAT GTG GCA 252 
Arg Tyr Gly Phe Tyr Ser Lys Thr Arg Asn Trp Alo Leu Asn Vol Alo 
50 55 60 

ACA CCT TTA ACA ACT CTT CAG TCT CCA TCC GAG GAA GAC ACG GAG AGA 300 
Thr Pro Leu Thr Thr Leu Gin Ser Pro Ser Glu Glu Asp Thr Glu Arg 
65 70 75 80 

TTC GAG CCA GGT GCG CCT CCT CCC TTC AAT TTG GOG GAT ATA AGA GCA 348 
Phe Asp Pro Gly Alo Pro Pro Pro Phe Asn Leu Alo Asp He Arg Alo 
85 90 95 

GCG ATA CCT AAG CAT TGT TGG GTT AAG AAT CCA TGG ATG TCT ATG AGT 396 
Alo lie Pro Lys His Cys Trp Vol Lys Asn Pro Trp Mel Ser Mel Ser 
100 105 110 

TAT GTT GTC AGA GAT GTT GOT ATC GTC TTT GGA TTG GCT GCT GTT GCT 444 
Tyr Vol Vol Arg Asp Vol Alo lie Vol Phe Gly Leu Alo Alo Vol Alo 
115 120 125 

GCT TAC TTC AAC AAT TGG CTT CTC TGG CCT CTC TAC TGG TTC GCT CAA 492 
Alo Tyr Phe Asn Asn Trp Leu Leu Trp Pro Leu Tyr Trp Phe Alo Gin 
130 135 140 

GGA ACC ATG TTC TGG GCT CTC TTT GTC CTT GGC CAT GAC TGC GGA CAT 540 
Gly Thr Mel Phe Trp Alo Leu Phe Vol Leu Gly His Asp Cys Gly His 

145 150 155 160 
GGT AGO TTC TCG AAT GAT COG AGG CTG AAC AGT GTG GOT GGT CAT CTT 588
Gly Ser Phe Ser Asn Asp Pro Arg Leu Asn Ser Vol Alo Gly His Leu 
165 170 175 

CTT CAT TCC TCA ATT CTG GTC CCT TAC CAT GGC TGG AGG ATT AGC CAC 636 
Leu His Ser Ser lie Leu Vol Pro Tyr His Gly Trp Arg He Ser His 
180 185 190 

AGA ACT CAC CAC CAG AAC CAT GGT CAT GTC GAG AAT GAC GAA TCA TGG 684 
Arg Thr His His Gin Asn His Gly His Vol Glu Asn Asp Glu Ser Trp 
195 200 205 

CAT CCT TTG CCT GAA AGC ATC TAC AAG AAT TTG GAA AAG ACG ACT CAA 732 
His Pro Leu Pro Glu Ser He Tyr Lys Asn Leu Glu Lys Thr Thr Gin 
210 215 220 

ATG TTT AGG TTT ACA CTG CCT TTT CCA ATG CTC GCA TAC CCT TTC TAC 780 
Met Phe Arg Phe Thr Leu Pro Phe Pro Mel Leu Alo Tyr Pro Phe Tyr 
225 230 235 240 

TTG TGG AAC AGA AGT CCA GGG AAA CAA GGT TCT CAT TAT CAT CCG GAC 828 
Leu Trp Asn Arg Ser Pro Gly Lys Gin Gly Ser His Tyr His Pro Asp 
245 250 255 

AGT GAC TTG TTT CTT CCA AAA GAG AAG AAA GAT GTT CTG ACA TCA ACT 876 
Ser Asp Leu Phe Leu Pro Lys Glu Lys Lys Asp Vol Leu Thr Ser Thr 
260 265 270 

GCC TGT TGG ACT GCA ATG GCT GCT TTG CTT GTT TGT CTC AAC TTT GTC 924 
Alo Cys Trp Thr Alo Mel Alo Alo Leu Leu Vol Cys Leu Asn Phe Vol 
275 280 285 

ATG GGT CCA ATC CAG ATG CTC AAA CTA TAT GGC ATC CCT TAT TGG ATA 972 
Met Gly Pro He Gin Mel Leu Lys Leu Tyr Gly He Pro Tyr Trp He 
290 295 300 

TTT GTA ATG TGG TTG GAC TTC GTC ACT TAC TTG CAC CAC CAT GGA CAT 1020 
Phe Vol Met Trp Leu Asp Phe Vol Thr Tyr Leu His His His Gly His 
305 310 315 320 



FIG. 12b

WO 94/18337
PCT/US94/01321

GAA GAC AAG CTC CCT TGG TAT CGT GGA AAG GAA TOG AGT TAC CTG AGA 1068 

Glo Asp Lys Leu Pro Trp Tyr Arg Gly Lys Glu Trp Ser Tyr Leu Arg 
325 330 335 

GGA GGG CTC ACA ACA TTA GAT CGT GAC TAC GGA TGG ATC AAT AAC ATC 11.16 
Gly Gly Leu Thr Thr Leu Asp Arg Asp Tyr Gly Trp He Asn Asn He 
340 345 350 

CAC CAC GAT ATT GGA ACT CAT GTG ATA CAT CAT CTT TTC CCG CAG ATC 1164 
His His Asp I le Gly Thr His Vol He His His Leu Phe Pro Gin He 
355 360 365 

CCA CAT TAT CAT CTA GTA GAA GCA ACA GAA GCA GCT AAA CCA GTA CTA 1212 
Pro His Tyr His Leu Vol Glu Alo Thr Glu Alo Alo Lys Pro Vol Leu 
370 375 380 

GGA AAG TAC TAC AGA GAA CCG AAA AAC TCT GGA CCT CTG CCA CTT CAC 1260 
Gly Lys Tyr Tyr Arg Glu Pro Lys Asn Ser Gly Pro Leu Pro Leu His 
385 390 395 400 

TTA CTG GGA AGC CTC ATA AAG AGT ATG AAA CAA GAC CAT TTC GTA AGC 1308 
Leu Leu Gly Ser Leu He Lys Ser Met Lys Gin Asp His Phe Vol Ser. 

405 410 415 

GAT ACA GGA GAT GTC GTG TAC TAT GAG GCA GAT CCA AAA CTC AAT GGA 1355 
Asp Thr Gly Asp Vol Vol Tyr Tyr Glu Alo Asp Pro Lys Leu Asn Gly 
420 425 430 

CAA AGA ACA TGAGGACATA CTGCAGTGAA CCAGGCAGAC AAGTTACATA 1405 
Gin Arg Thr 
435 

AATTCATCTT GGCCCATTCA TTATGTTCTT TTTGTTTTGG TGTAAAGCCT TTTCGAGATT 1465 
AAAAAAGCAT TAATTTGTAG AAAGCTGTGG TAAAACTCTC GATCAAATGA AATA«5ATAT 1525 
Met Ala Ser Ser Val Leu Ser Glu Cys Gly Phe Arg Pro Leu Pro Arg
1 5 10 15

Phe Tyr Pro Lys His Thr Thr Ser Phe Ala Ser Asn Pro Lys Pro Thr
20 25 30

Phe Lys Phe Asn Pro Pro Leu Lys Pro Pro Ser Ser Leu Leu Asn Ser
35 40 45

Arg Tyr Gly Phe Tyr Ser Lys Thr Arg Asn Trp Ala Leu Asn Val Ala
50 55 60

Thr Pro Leu Thr Thr Leu Gln Ser Pro Ser Glu Glu Asp Thr Glu Arg
65 70 75 80

Phe Asp Pro Gly Ala Pro Pro Pro Phe Asn Leu Ala Asp Ile Arg Ala
85 90 95

Ala Ile Pro Lys His Cys Trp Val Lys Asn Pro Trp Met Ser Met Ser
100 105 110

Tyr Val Val Arg Asp Val Ala Ile Val Phe Gly Leu Ala Ala Val Ala
115 120 125

Ala Tyr Phe Asn Asn Trp Leu Leu Trp Pro Leu Tyr Trp Phe Ala Gln
130 135 140

Gly Thr Met Phe Trp Ala Leu Phe Val Leu Gly His Asp Cys Gly His
145 150 155 160

Gly Ser Phe Ser Asn Asp Pro Arg Leu Asn Ser Val Ala Gly His Leu
165 170 175

Leu His Ser Ser Ile Leu Val Pro Tyr His Gly Trp Arg Ile Ser His
180 185 190

Arg Thr His His Gln Asn His Gly His Val Glu Asn Asp Glu Ser Trp
195 200 205

His Pro Leu Pro Glu Ser Ile Tyr Lys Asn Leu Glu Lys Thr Thr Gln
210 215 220

Met Phe Arg Phe Thr Leu Pro Phe Pro Met Leu Ala Tyr Pro Phe Tyr
225 230 235 240

Leu Trp Asn Arg Ser Pro Gly Lys Gln Gly Ser His Tyr His Pro Asp
245 250 255

Ser Asp Leu Phe Leu Pro Lys Glu Lys Lys Asp Val Leu Thr Ser Thr
260 265 270

Ala Cys Trp Thr Ala Met Ala Ala Leu Leu Val Cys Leu Asn Phe Val
275 280 285

Met Gly Pro Ile Gln Met Leu Lys Leu Tyr Gly Ile Pro Tyr Trp Ile
290 295 300

Phe Val Met Trp Leu Asp Phe Val Thr Tyr Leu His His His Gly His
305 310 315 320

Glu Asp Lys Leu Pro Trp Tyr Arg Gly Lys Glu Trp Ser Tyr Leu Arg
325 330 335

Gly Gly Leu Thr Thr Leu Asp Arg Asp Tyr Gly Trp Ile Asn Asn Ile
340 345 350

His His Asp Ile Gly Thr His Val Ile His His Leu Phe Pro Gln Ile
355 360 365

Pro His Tyr His Leu Val Glu Ala Thr Glu Ala Ala Lys Pro Val Leu
370 375 380

Gly Lys Tyr Tyr Arg Glu Pro Lys Asn Ser Gly Pro Leu Pro Leu His
385 390 395 400

Leu Leu Gly Ser Leu Ile Lys Ser Met Lys Gln Asp His Phe Val Ser
405 410 415

Asp Thr Gly Asp Val Val Tyr Tyr Glu Ala Asp Pro Lys Leu Asn Gly
420 425 430

Gln Arg Thr
435



FIG. 13b 
AN OLEOSIN 5' REGULATORY REGION FOR THE

Seed oil content has traditionally been
modified by plant breeding. The use of recombinant
DNA technology to alter seed oil composition can
accelerate this process and in some cases alter seed
oils in a way that cannot be accomplished by breeding
alone. The oil composition of Brassica has been
significantly altered by modifying the expression of a
number of lipid metabolism genes. Such manipulations
of seed oil composition have focused on altering the
proportion of endogenous component fatty acids. For
example, antisense repression of the Δ12-desaturase
gene in transgenic rapeseed has resulted in an
increase in oleic acid of up to 83%. Topfer et al.
1995 Science 268:681-686.

There have been some successful attempts at
modifying the composition of seed oil in transgenic
plants by introducing new genes that allow the
production of a fatty acid that the host plants were
not previously capable of synthesizing. Van de Loo
et al. (1995 Proc. Natl. Acad. Sci USA 92:6743-6747)
have been able to introduce a Δ12-hydroxylase gene
into transgenic tobacco, resulting in the introduction
of a novel fatty acid, ricinoleic acid, into its seed
oil. The reported accumulation was modest from plants
carrying constructs in which transcription of the
hydroxylase gene was under the control of the
cauliflower mosaic virus (CaMV) 35S promoter.
Similarly, tobacco plants have been engineered to
produce low levels of petroselinic acid by expression
of an acyl-ACP desaturase from coriander (Cahoon et
al. 1992 Proc. Natl. Acad. Sci USA 89:11184-11188).

The long chain fatty acids (C18 and larger)
have significant economic value both as nutritionally
and medically important foods and as industrial
commodities (Ohlrogge, J.B. 1994 Plant Physiol.
104:821-826). Linoleic (18:2 Δ9,12) and α-linolenic
acid (18:3 Δ9,12,15) are essential fatty acids found
in many seed oils. The levels of these fatty acids
have been manipulated in oil seed crops through
breeding and biotechnology (Ohlrogge, et al. 1991
Biochim. Biophys. Acta 1082:1-26; Topfer et al. 1995
Science 268:681-686). Additionally, the production of
novel fatty acids in seed oils can be of considerable
use in both human health and industrial applications.

Consumption of plant oils rich in γ-
linolenic acid (GLA) (18:3 Δ6,9,12) is thought to
alleviate hypercholesterolemia and other related
clinical disorders which correlate with susceptibility
to coronary heart disease (Brenner R.R. 1976 Adv. Exp.
Med. Biol. 83:85-101). The therapeutic benefits of
dietary GLA may result from its role as a precursor to
prostaglandin synthesis (Weete, J.D. 1980 in Lipid
Biochemistry of Fungi and Other Organisms, eds. Plenum
Press, New York, pp. 59-62). Linoleic acid (18:2) (LA)
is transformed into gamma linolenic acid (18:3) (GLA)
by the enzyme Δ6-desaturase.
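
The shorthand used above (carbon count, number of double bonds, and Δ positions counted from the carboxyl end) and the Δ6 desaturation step can be sketched in a few lines. This is an illustrative model only: the class and function names are assumptions made for this sketch, and it tracks notation rather than any enzymology.

```python
# Illustrative sketch only. FattyAcid and desaturate are hypothetical names
# introduced here to mirror the shorthand notation used in the text.
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FattyAcid:
    carbons: int
    double_bonds: Tuple[int, ...]  # Δ positions, counted from the carboxyl end

    def shorthand(self) -> str:
        """Format as e.g. '18:2 Δ9,12' (or '18:0' when fully saturated)."""
        if not self.double_bonds:
            return f"{self.carbons}:0"
        positions = ",".join(str(p) for p in self.double_bonds)
        return f"{self.carbons}:{len(self.double_bonds)} Δ{positions}"


def desaturate(fa: FattyAcid, position: int) -> FattyAcid:
    """Return a new fatty acid with one extra double bond at `position`."""
    if position in fa.double_bonds:
        raise ValueError(f"double bond already present at Δ{position}")
    return FattyAcid(fa.carbons, tuple(sorted(fa.double_bonds + (position,))))


LINOLEIC = FattyAcid(18, (9, 12))      # LA, 18:2 Δ9,12
GLA = desaturate(LINOLEIC, 6)          # Δ6 desaturation of LA

if __name__ == "__main__":
    print(LINOLEIC.shorthand(), "->", GLA.shorthand())
```

In this notation, applying a Δ6 desaturation to linoleic acid yields 18:3 Δ6,9,12, which matches the shorthand given for GLA.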
Few seed oils contain GLA despite high
contents of the precursor linoleic acid. This is due
to the absence of Δ6-desaturase activity in most
plants. For example, only borage (Borago
officinalis), evening primrose (Oenothera biennis),
and currants (Ribes nigrum) produce appreciable
amounts of linolenic acid. Of these three species,
only Oenothera and borage are cultivated as a
commercial source for GLA. It would be beneficial if
agronomic seed oils could be engineered to produce GLA
in significant quantities by introducing a
heterologous Δ6-desaturase gene. It would also be
beneficial if other expression products associated
with fatty acid synthesis and lipid metabolism could
be produced in plants at high enough levels so that
commercial production of a particular expression
product becomes feasible.

As disclosed in U.S. Patent No. 5,552,306, a
cyanobacterial Δ6-desaturase gene has been recently
isolated. Expression of this cyanobacterial gene in
transgenic tobacco resulted in significant but low
level GLA accumulation (Reddy et al. 1996 Nature
Biotech. 14:639-642). Applicant's copending U.S.
Application Serial No. 08,366,779 discloses a Δ6-
desaturase gene isolated from the plant Borago
officinalis and its expression in tobacco under the
control of the CaMV 35S promoter. Such expression
resulted in significant but low level GLA and
octadecatetraenoic acid (ODTA or OTA) accumulation in
seeds. Thus, a need exists for a promoter which
functions in plants and which consistently directs
high level expression of lipid metabolism genes in
transgenic plant seeds.

WO 98/45461
PCT/US98/07179

Oleosins are abundant seed proteins
associated with the phospholipid monolayer membrane of
oil bodies. The first oleosin gene, L3, was cloned
from maize by selecting clones whose in vitro
translated products were recognized by an anti-L3
antibody (Vance et al. 1987 J. Biol. Chem. 262:11275-
11279). Subsequently, different isoforms of oleosin
genes from such different species as Brassica,
soybean, carrot, pine, and Arabidopsis have been
cloned (Huang, A.H.C., 1992, Ann. Reviews Plant Phys.
and Plant Mol. Biol. 43:177-200; Kirik et al., 1996
Plant Mol. Biol. 31:413-417; Van Rooijen et al., 1992
Plant Mol. Biol. 15:1177-1179; Zou et al., Plant Mol.
Biol. 31:429-433). Oleosin protein sequences predicted
from these genes are highly conserved, especially for
the central hydrophobic domain. All of these oleosins
have the characteristic feature of three distinctive
domains. An amphipathic domain of 40-60 amino acids
is present at the N-terminus; a totally hydrophobic
domain of 68-74 amino acids is located at the center;
and an amphipathic α-helical domain of 33-40 amino
acids is situated at the C-terminus (Huang, A.H.C.

The present invention provides 5' regulatory
sequences from an oleosin gene which direct high level
expression of lipid metabolism genes in transgenic
plants. In accordance with the present invention,
chimeric constructs comprising an oleosin 5'
regulatory region operably linked to coding sequence
for a lipid metabolism gene such as a Δ6-desaturase
gene are provided. Transgenic plants comprising the
subject chimeric constructs produce levels of GLA
approaching the level found in those few plant species
which naturally produce GLA such as evening primrose
(Oenothera biennis).

The present invention is directed to 5'
regulatory regions of an Arabidopsis oleosin gene.
The 5' regulatory regions, when operably linked to
either the coding sequence of a heterologous gene or
sequence complementary to a native plant gene, direct
expression of the heterologous gene or complementary
sequence in a plant seed.

The present invention thus provides
expression cassettes and expression vectors comprising
an oleosin 5' regulatory region operably linked to a
heterologous gene or a sequence complementary to a
native plant gene.

Plant transformation vectors comprising the
expression cassettes and expression vectors are also
provided as are plant cells transformed by these
vectors, and plants and their progeny containing the
vectors.

In one embodiment of the invention, the
heterologous gene or complementary gene sequence is a
fatty acid synthesis gene or a lipid metabolism gene.
In another aspect of the present invention,
a method is provided for producing a plant with 
increased levels of a product of a fatty acid 
synthesis or lipid metabolism gene. 

In particular, there is provided a method 
for producing a plant with increased levels of a fatty 
acid synthesis or lipid metabolism gene by 
transforming a plant with the subject expression 
cassettes and expression vectors which comprise an 
oleosin 5' regulatory region and a coding sequence for
a fatty acid synthesis or lipid metabolism gene. 

In another aspect of the present invention, 
there is provided a method for cosuppressing a native 
fatty acid synthesis or lipid metabolism gene by 
transforming a plant with the subject expression 
cassettes and expression vectors which comprise an 
oleosin 5' regulatory region and a coding sequence for 
a fatty acid synthesis or lipid metabolism gene. 

A further aspect of this invention provides
a method of decreasing production of a native plant
gene such as a fatty acid synthesis gene or a lipid
metabolism gene by transforming a plant with an
expression vector comprising an oleosin 5' regulatory
region operably linked to a nucleic acid sequence
complementary to a native plant gene.

Also provided are methods of modulating the 
levels of a heterologous gene such as a fatty acid 
synthesis or lipid metabolism gene by transforming a 
plant with the subject expression cassettes and 
expression vectors. 
BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 depicts the nucleotide and
corresponding amino acid sequence of the borage Δ6-
desaturase gene (SEQ ID NO:1). The cytochrome b5
heme-binding motif is boxed and the putative metal
binding, histidine rich motifs (HRMs) are underlined.
The motifs recognized by the primers (PCR analysis)
are underlined with dotted lines, i.e. tgg aaa tgg aac
cat aa; and gag cat cat ttg ttt cc.
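
As a reading aid, the two primer-recognized motifs quoted above can be translated with the standard genetic code. The sketch below is illustrative only: it assumes that the spacing printed in the text reflects the reading frame, it ignores the trailing incomplete codon of each motif, and the function name is an assumption made for this example.

```python
# Illustrative sketch only. It assumes the triplet spacing printed in the
# figure description reflects the reading frame of the coding strand.
from itertools import product

BASES = "TCAG"
# Standard genetic code, ordered by first, second, third base over T, C, A, G.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; spaces and a trailing partial codon are dropped."""
    seq = dna.replace(" ", "").upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    for motif in ("tgg aaa tgg aac cat aa", "gag cat cat ttg ttt cc"):
        print(motif, "->", translate(motif))
```

Under these assumptions both motifs encode histidine-containing stretches, in keeping with the histidine rich motifs mentioned in the figure description.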

Fig. 2 is a dendrogram showing similarity of
the borage Δ6-desaturase to other membrane-bound
desaturases. The amino acid sequence of the borage Δ6-
desaturase was compared to other known desaturases
using GeneWorks (IntelliGenetics). Numerical values
correlate to relative phylogenetic distances between
subgroups compared.

Fig. 3A provides a gas liquid chromatography
profile of the fatty acid methyl esters (FAMES)
derived from leaf tissue of a wild type tobacco
'xanthi'.

Fig. 3B provides a gas liquid chromatography
profile of the FAMES derived from leaf tissue of a
tobacco plant transformed with the borage Δ6-
desaturase cDNA under transcriptional control of the
CaMV 35S promoter (pAN2). Peaks corresponding to
methyl linoleate (18:2), methyl γ-linolenate (18:3γ),
methyl α-linolenate (18:3α), and methyl
octadecatetraenoate (18:4) are indicated.
Fig. 4 is the nucleotide sequence and
corresponding amino acid sequence of the oleosin AtS21
cDNA (SEQ ID NO:3).

Fig. 5 is an acidic-base map of the
predicted AtS21 protein generated by DNA Strider 1.2.

Fig. 6 is a Kyte-Doolittle plot of the 
predicted AtS21 protein generated by DNA Strider 1.2. 
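
For readers unfamiliar with this kind of plot, a Kyte-Doolittle profile is a sliding-window average of per-residue hydropathy values. The sketch below is a minimal illustration using the published Kyte-Doolittle scale; the window size of 9, the function name, and the example peptide are assumptions made here, and nothing in it reproduces the settings of DNA Strider or the AtS21 sequence.

```python
# Illustrative sketch only. The window size and the example peptide are
# hypothetical; the scale values are the Kyte-Doolittle hydropathy indices.
from typing import List

KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 9) -> List[float]:
    """Return the mean hydropathy of each full window along the sequence.

    One-letter amino acid codes are expected; an unknown letter raises
    KeyError. Sequences shorter than the window give an empty profile.
    """
    scores = [KYTE_DOOLITTLE[residue] for residue in protein.upper()]
    if window <= 0 or window > len(scores):
        return []
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


if __name__ == "__main__":
    # Hypothetical peptide: polar flanks around a hydrophobic core.
    example = "MDKRESTDQ" + "LLVAAIVLGLIAV" + "KDERSTQNE"
    for start, value in enumerate(hydropathy_profile(example), start=1):
        print(start, round(value, 2))
```

Positive window averages indicate hydrophobic stretches, so a long central run of positive values would be expected for a protein with a central hydrophobic domain such as the one described for oleosins.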

Fig. 7 is a sequence alignment of oleosins
isolated from Arabidopsis. Oleosin sequences
published or deposited in EMBL, BCM, NCBI databases
were aligned to each other using GeneWorks® 2.3.
Identical residues are boxed with rectangles. The
seven sequences fall into three groups. The first
group includes AtS21 (SEQ ID NO:5), X91918 (SEQ ID
NO:6) and Z29859 (SEQ ID NO:7). The second group
includes X62352 (SEQ ID NO:8) and Atol3 (SEQ ID NO:9).
The third group includes X91956 (SEQ ID NO:10) and
L40954 (SEQ ID NO:11). Differences in amino acid
residues within the same group are indicated by
shadows. Atol2/Z54164 is identical to AtS21. Atol3
sequence (Accession No. Z54165 in EMBL database) is
actually not disclosed in the EMBL database. The
Z54165 Accession number designates the same sequence
as Z54164 which is Atol2.

Fig. 8A is a Northern analysis of the AtS21
gene. An RNA gel blot containing ten micrograms of
total RNA extracted from Arabidopsis flowers (F),
leaves (L), roots (R), developing seeds (Se), and
developing silique coats (Si) was hybridized with a
probe made from the full-length AtS21 cDNA.
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Fig 8B is a southern analysis of the AtS21 
gene. A DNA gel blot containing ten -i-ograms of 
genomic DNA digested with BamHI (B) . ^^^^ 
Hindlll (H) . sad (S) , and Xbal (X) was hybrxdxzed 
"ith a proL made fro. the full length AtS21 cDNA^ 
Fig 9 is the nucleotide sequence of the 
sad fragment of AtS21 genomic DNA (SEQ ID NO:12K 
IZ promoter and intron sequences are in uppercase 

fragments corresponding to AtS21 cDNA sequence are 
In lower case. The first ATG codon and a putatxve 
;LaTox are shadowed. The ^-^^ --/"J,::, 
21P primer for PGR amplification is boxed. A P^^atxve 
abLiric acid response element (ABRE) and two 14 bp 
.epeats a- ^erlined- ^^^^^ ,_..er/aUS 

construct (pAN5) ^^^^^^^ ^^^^^^^^^ ^^^^ expression 

in Arabidopsis bolt and leaves. 

Fig. 11B depicts AtS21 GUS gene expression
in Arabidopsis siliques.

Fig. 11C depicts AtS21 GUS gene expression
in Arabidopsis developing seeds.

Figs. 11D through 11J depict AtS21/GUS gene
expression in Arabidopsis developing embryos.

Fig. 11K depicts AtS21/GUS gene expression
in Arabidopsis root and root hairs of a young
seedling.

Fig. 11L depicts AtS21/GUS gene expression
in Arabidopsis cotyledons and the shoot apex of a five
day seedling.

Figs. 11M and 11N depict AtS21/GUS gene
expression in Arabidopsis cotyledons and the shoot
apex of 5-15 day seedlings.

Fig. 12A depicts AtS21/GUS gene expression 

in tobacco embryos and endosperm. 

Fig. 12B depicts AtS21/GUS gene expression 

in germinating tobacco seeds. 

Fig. 12C depicts AtS21/GUS gene expression 
in a 5 day old tobacco seedling. 

Fig. 12D depicts AtS21/GUS gene expression 
in 5-15 day old tobacco seedlings. 

Fig. 13A is a Northern analysis showing
AtS21 mRNA levels in developing wild-type Arabidopsis
seedlings. Lane 1 was loaded with RNA from developing
seeds; lane 2 was loaded with RNA from seeds imbibed
for 24-48 hours; lane 3: 3 day seedlings; lane 4: 4
day seedlings; lane 5: 5 day seedlings; lane 6: 6 day
seedlings; lane 7: 9 day seedlings; lane 8: 12 day
seedlings. Probe was labeled AtS21 cDNA. Exposure
was for one hour at -80°C.

Fig. 13B is the same blot as Fig. 13A only 
exposure was for 24 hours at -80°C. 

Fig. 13C is the same blot depicted in Figs.
13A and 13B after stripping and hybridization with an
Arabidopsis tubulin gene probe. The small band in
each of lanes 1 and 2 is the remnant of the previous
AtS21 probe. Exposure was for 48 hours at -80°C.

Fig. 14 is a graph comparing GUS activities 
expressed by the AtS21 and 35S promoters. GUS
activities expressed by the AtS21 promoter in
developing Arabidopsis seeds and leaf are plotted side
by side with those expressed by the 35S promoter. The 
GUS activities expressed by the AtS21 promoter in 
tobacco dry seed and leaf are plotted on the right 
side of the figure. GUS activity in tobacco leaf is 
so low that no column appears. "G-H" denotes globular 
to heart stage; "H-T" denotes heart to torpedo stage; 
"T-C" denotes torpedo to cotyledon stage; "Early C" 
denotes early cotyledon; "Late C" denotes late 
cotyledon. The standard deviations are listed in 
Table 2. 

Fig. 15A is an RNA gel blot analysis carried
out on 5 μg samples of RNA isolated from borage leaf,
root, and 12 dpp embryo tissue, using labeled borage
Δ6-desaturase cDNA as a hybridization probe.

Fig. 15B depicts a graph corresponding to
the Northern analysis results for the experiment shown
in Fig. 15A.

Fig. 16A is a graph showing relative legumin 
RNA accumulation in developing borage embryos based on
results of Northern blot. 

Fig. 16B is a graph showing relative 
oleosin RNA accumulation in developing borage embryos 
based on results of Northern blot. 

Fig. 16C is a graph showing relative
Δ6-desaturase RNA accumulation in developing borage
embryos based on results of Northern blot.

Fig. 17 is a PCR analysis showing the
presence of the borage delta 6-desaturase gene in
transformed plants of oilseed rape. Lanes 1, 3 and 4
were loaded with PCR reactions performed with DNA from
plants transformed with the borage delta 6-desaturase
gene linked to the oleosin 5' regulatory region; lane
2: DNA from plant transformed with the borage delta
6-desaturase gene linked to the albumin 5' regulatory
region; lanes 5 and 6: DNA from non-transformed
plants; lane 7: molecular weight marker (1 kb ladder,
Gibco BRL); lane 8: PCR without added template DNA;
lane 9: control with DNA from Agrobacterium
tumefaciens EHA 105 containing the plasmid pAN3 (i.e.
the borage delta 6-desaturase gene linked to the
oleosin 5' regulatory region).

DETAILED DESCRIPTION OF THE INVENTION

The present invention provides isolated 
nucleic acids encoding 5' regulatory regions from an 
Arabidopsis oleosin gene. In accordance with the 
present invention, the subject 5' regulatory regions, 
when operably linked to either a coding sequence of a 
heterologous gene or a sequence complementary to a 
native plant gene, direct expression of the coding 
sequence or complementary sequence in a plant seed. 
The oleosin 5' regulatory regions of the present 
invention are useful in the construction of an 
expression cassette which comprises in the 5' to 3'
direction, a subject oleosin 5' regulatory region, a
heterologous gene or sequence complementary to a
native plant gene under control of the regulatory
region and a 3' termination sequence. Such an
expression cassette can be incorporated into a variety
of autonomously replicating vectors in order to
construct an expression vector.
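
The 5' to 3' ordering of the cassette elements described above can be illustrated with a minimal Python sketch. The function name, the element labels and the short placeholder sequences are hypothetical and are used only for illustration; they are not taken from SEQ ID NO:12 or from any construct of the invention.

```python
from collections import namedtuple

# Hypothetical container for one cassette element (illustration only).
Element = namedtuple("Element", ["role", "sequence"])


def build_cassette(regulatory_region, coding_sequence, terminator):
    """Join cassette elements in the 5' to 3' order given in the text:
    5' regulatory region, heterologous (or complementary) sequence,
    3' termination sequence."""
    elements = [
        Element("oleosin 5' regulatory region", regulatory_region),
        Element("heterologous or complementary sequence", coding_sequence),
        Element("3' termination sequence", terminator),
    ]
    for element in elements:
        # Reject empty elements and anything that is not plain DNA.
        if not element.sequence or set(element.sequence.upper()) - set("ACGT"):
            raise ValueError(f"invalid sequence for {element.role}")
    return "".join(element.sequence.upper() for element in elements)


# Placeholder sequences, not real promoter, gene or terminator sequences.
cassette = build_cassette("ttgacatata", "ATGGCTTAA", "aataaa")
```

The sketch only models the linear order of the elements; operable linkage in a real construct also depends on reading frame, spacing and the vector context.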

It has been surprisingly found that plants 
transformed with the expression vectors of the present 
invention produce levels of GLA approaching the level 
found in those few plant species which naturally 
produce GLA such as evening primrose (Oenothera
biennis).

As used herein, the term "cassette" refers
to a nucleotide sequence capable of expressing a
particular gene if said gene is inserted so as to be
operably linked to one or more regulatory regions
present in the nucleotide sequence. Thus, for
example, the expression cassette may comprise a
heterologous coding sequence which is desired to be
expressed in a plant seed. The expression cassettes
and expression vectors of the present invention are
therefore useful for directing seed-specific
expression of any number of heterologous genes. The
term "seed-specific expression" as used herein, refers
to expression in various portions of a plant seed such
as the endosperm and embryo.

An isolated nucleic acid encoding a 5' 
regulatory region from an oleosin gene can be provided 
as follows. Oleosin recombinant genomic clones are 
isolated by screening a plant genomic DNA library with 
a cDNA (or a portion thereof) representing oleosin 
mRNA. A number of different oleosin cDNAs have been 
isolated. The methods used to isolate such cDNAs as
well as the nucleotide and corresponding amino acid
sequences have been published in Kirik et al. 1986
Plant Mol. Biol. 31:413-417; Zou et al. Plant Mol.
Biol. 31:429-433; Van Rooigen et al. 1992 Plant Mol.
Biol. 15:1177-1179.

Virtual subtraction screening of a tissue
specific library using a random primed polymerase
chain reaction (RP-PCR) cDNA probe is another method of
obtaining an oleosin cDNA useful for screening a plant
genomic DNA library. Virtual subtraction screening
refers to a method where a cDNA library is constructed
from a target tissue and displayed at a low density so
that individual cDNA clones can be easily separated.
These cDNA clones are subtractively screened with
driver quantities (i.e., concentrations of DNA to
kinetically drive the hybridization reaction) of cDNA
probes made from tissue or tissues other than the
target tissue (i.e. driver tissue). The hybridized
plaques represent genes that are expressed in both the
target and the driver tissues; the unhybridized
plaques represent genes that may be target tissue-specific
or low abundance genes that cannot be
detected by the driver cDNA probe. The unhybridized
cDNAs are selected as putative target tissue-specific
genes and further analyzed by one-pass sequencing and
Northern hybridization.
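
The selection logic of virtual subtraction screening can be sketched as a set difference. This is only an illustrative model, assuming each plaque has already been scored as hybridized or not; the clone identifiers, tissue names and the function name are hypothetical.

```python
def virtual_subtraction(target_clones, hybridized_by_driver):
    """Return clones from the target library that did not hybridize to any
    driver probe, i.e. putative target tissue-specific (or low abundance)
    clones that go on to sequencing and Northern hybridization."""
    hybridized = set()
    for clones in hybridized_by_driver.values():
        hybridized |= set(clones)
    return sorted(set(target_clones) - hybridized)


# Hypothetical plaque identifiers from a low-density plating.
target = ["c1", "c2", "c3", "c4", "c5"]
# Hypothetical hybridization calls for two driver tissues.
driver_hits = {"leaf": ["c1", "c3"], "root": ["c3", "c4"]}

candidates = virtual_subtraction(target, driver_hits)
```

As the text notes, an unhybridized clone is only a candidate: a low abundance transcript shared with the driver tissue would be selected by the same rule.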

Random primed PCR (RP-PCR) involves
synthesis of large quantities of cDNA probes from a
trace amount of cDNA template. The method combines
the amplification power of PCR with the representation
of random priming to simultaneously amplify and label
double-stranded cDNA in a single tube reaction.

Methods considered useful in obtaining 
oleosin genomic recombinant DNA are provided in 
Sambrook et al. 1989, in Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor, NY, for
example, or any of the myriad of laboratory manuals on
recombinant DNA technology that are widely available.
To determine nucleotide sequences, a multitude of
techniques are available and known to the ordinarily
skilled artisan. For example, restriction fragments
containing an oleosin regulatory region can be
subcloned into the polylinker site of a sequencing
vector such as pBluescript (Stratagene). These
pBluescript subclones can then be sequenced by the
double-stranded dideoxy method (Chen and Seeburg,
1985, DNA 4:165).

In a preferred embodiment, the oleosin 
regulatory region comprises nucleotides 1-1267 of Fig. 
9 (SEQ ID NO:12). Modifications to the oleosin 
regulatory region as set forth in SEQ ID NO: 12 which 
maintain the characteristic property of directing 
seed-specific expression, are within the scope of the
present invention. Such modifications include
insertions, deletions and substitutions of one or more
nucleotides.

The 5' regulatory region of the present
invention can be derived from restriction endonuclease
or exonuclease digestion of an oleosin genomic clone.
Thus, for example, the known nucleotide or amino acid
sequence of the coding region of an isolated oleosin
gene (e.g. Fig. 7) is aligned to the nucleic acid or
deduced amino acid sequence of an isolated oleosin
genomic clone, and the 5' flanking sequence (i.e., sequence
upstream from the translational start codon of the
coding region) of the isolated oleosin genomic clone
is located.

The oleosin 5' regulatory region as set
forth in SEQ ID NO:12 (nucleotides 1-1267 of Fig. 9)
may be generated from a genomic clone having either or
both excess 5' flanking sequence or coding sequence by
exonuclease III-mediated deletion. This is
accomplished by digesting appropriately prepared DNA
with exonuclease III (exoIII) and removing aliquots at
increasing intervals of time during the digestion.
The resulting successively smaller fragments of DNA
may be sequenced to determine the exact endpoint of
the deletions. There are several commercially
available systems which use exonuclease III (exoIII)
to create such a deletion series, e.g. Promega
Biotech, "Erase-A-Base" system. Alternatively, PCR
primers can be defined to allow direct amplification
of the subject 5' regulatory regions.

Using the same methodologies, the 
ordinarily skilled artisan can generate one or more 
deletion fragments of nucleotides 1-1267 as set forth 
in SEQ ID NO:12. Any and all deletion fragments which
comprise a contiguous portion of nucleotides set forth
in SEQ ID NO:12 and which retain the capacity to
direct seed-specific expression are contemplated by
the present invention.
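
A nested deletion series of the 1-1267 region can be described by coordinates alone. The sketch below is a simplified model, assuming unidirectional deletion from the 5' end in equal steps with the 3' end held fixed; the step size, the minimum length and the function name are hypothetical, and real endpoints would be determined by sequencing as stated above.

```python
def nested_deletions(region_length, step, min_length):
    """List 1-based inclusive (start, end) coordinates of successively
    shorter fragments, deleting `step` nucleotides from the 5' end each
    time and stopping before a fragment falls below `min_length`."""
    if region_length <= 0 or step <= 0 or min_length <= 0:
        raise ValueError("all arguments must be positive")
    fragments = []
    start = 1
    while region_length - start + 1 >= min_length:
        fragments.append((start, region_length))
        start += step
    return fragments


# 1267 is the length of the regulatory region given in the text;
# the 200 nt step and 400 nt minimum are illustrative assumptions.
series = nested_deletions(region_length=1267, step=200, min_length=400)
```

Each tuple is a contiguous portion of nucleotides 1-1267; whether a given fragment still directs seed-specific expression has to be tested experimentally.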

The identification of oleosin 5' regulatory
sequences which direct seed-specific expression
comprising nucleotides 1-1267 of SEQ ID NO:12 and
modifications or deletion fragments thereof, can be
accomplished by transcriptional fusions of specific
sequences with the coding sequences of a heterologous
gene, transfer of the chimeric gene into an
appropriate host, and detection of the expression of
the heterologous gene. The assay used to detect
expression depends upon the nature of the heterologous
sequence. For example, reporter genes, exemplified by
chloramphenicol acetyl transferase and β-glucuronidase
(GUS), are commonly used to assess transcriptional and
translational competence of chimeric constructions.
Standard assays are available to sensitively detect
the reporter enzyme in a transgenic organism. The
β-glucuronidase (GUS) gene is useful as a reporter of
promoter activity in transgenic plants because of the
high stability of the enzyme in plant cells, the lack
of intrinsic β-glucuronidase activity in higher plants
and availability of a quantitative fluorimetric assay
and a histochemical localization technique. Jefferson
et al. (1987 EMBO J 5:3901) have established standard
procedures for biochemical and histochemical detection
of GUS activity in plant tissues. Biochemical assays
are performed by mixing plant tissue lysates with
4-methylumbelliferyl-β-D-glucuronide, a fluorimetric
substrate for GUS, incubating one hour at 37°C, and
then measuring the fluorescence of the resulting
4-methylumbelliferone. Histochemical localization for
GUS activity is determined by incubating plant tissue
samples in 5-bromo-4-chloro-3-indolyl-glucuronide
(X-Gluc) for about 18 hours at 37°C and observing the
staining pattern of X-Gluc. The construction of such
chimeric genes allows definition of specific
regulatory sequences and demonstrates that these
sequences can direct expression of heterologous genes
in a seed-specific manner.
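
One way the fluorimetric readout might be turned into a specific activity is sketched below. This is an assumption-laden illustration, not the procedure of Jefferson et al.: it assumes a linear 4-methylumbelliferone standard curve passing through the blank, and the function name, parameter names and all numerical values are hypothetical.

```python
def gus_specific_activity(fluorescence, blank, slope_fu_per_pmol,
                          minutes, protein_mg):
    """Return pmol of 4-methylumbelliferone formed per minute per mg of
    protein, assuming fluorescence is linear in product amount."""
    if slope_fu_per_pmol <= 0 or minutes <= 0 or protein_mg <= 0:
        raise ValueError("slope, minutes and protein_mg must be positive")
    # Convert blank-corrected fluorescence units to pmol of product.
    pmol_product = (fluorescence - blank) / slope_fu_per_pmol
    return pmol_product / minutes / protein_mg


# Hypothetical reading after the one hour (60 minute) incubation at 37°C.
activity = gus_specific_activity(
    fluorescence=1250.0,    # sample reading, arbitrary fluorescence units
    blank=50.0,             # reading of a lysate-free control
    slope_fu_per_pmol=2.0,  # slope of an assumed standard curve
    minutes=60.0,
    protein_mg=0.01,
)
```

Normalizing to protein allows lysates from different tissues or promoter constructs to be compared on the same scale.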

Another aspect of the invention is directed 
to expression cassettes and expression vectors (also 
termed herein "chimeric genes") comprising a 5' 
regulatory region from an oleosin gene which directs 
seed specific expression operably linked to the coding 
sequence of a heterologous gene such that the 
regulatory element is capable of controlling 
expression of the product encoded by the heterologous 
gene. The heterologous gene can be any gene other 
than oleosin. If necessary, additional regulatory 
elements or parts of these elements sufficient to 
cause expression resulting in production of an 
effective amount of the polypeptide encoded by the 
heterologous gene are included in the chimeric 
constructs.

Accordingly, the present invention provides 
chimeric genes comprising sequences of the oleosin 5' 
regulatory region that confer seed-specific expression
which are operably linked to a sequence encoding a
heterologous gene such as a lipid metabolism enzyme.
Examples of lipid metabolism genes useful for
practicing the present invention include lipid
desaturases such as Δ6-desaturases, Δ12-desaturases,
Δ15-desaturases and other related desaturases such as
stearoyl-ACP desaturases, acyl carrier proteins
(ACPs), thioesterases, acetyl transacylases,
acetyl-coA carboxylases, ketoacyl-synthases, malonyl
transacylases, and elongases. Such lipid metabolism
genes have been isolated and characterized from a
number of different bacteria and plant species. Their
nucleotide coding sequences as well as methods of
isolating such coding sequences are disclosed in the
published literature and are widely available to those
of skill in the art.

In particular, the Δ6-desaturase genes
disclosed in U.S. Patent No. 5,552,306 and 
applicants' copending U.S. Application Serial No. 
08/366,779 filed December 30, 1994 and incorporated 
herein by reference, are contemplated as lipid 
metabolism genes particularly useful in the practice 
of the present invention. 

The chimeric genes of the present invention
are constructed by ligating a 5' regulatory region of
an oleosin genomic DNA to the coding sequence of a
heterologous gene. The juxtaposition of these
sequences can be accomplished in a variety of ways.
In a preferred embodiment the order of the sequences,
from 5' to 3', is an oleosin 5' regulatory region
(including a promoter), a coding sequence, and a
termination sequence which includes a polyadenylation
site.

Standard techniques for construction of such
chimeric genes are well known to those of ordinary
skill in the art and can be found in references such
as Sambrook et al. (1989). A variety of strategies are
available for ligating fragments of DNA, the choice of 
which depends on the nature of the termini of the DNA 
fragments. One of ordinary skill in the art 
recognizes that in order for the heterologous gene to 
be expressed, the construction requires promoter 
elements and signals for efficient polyadenylation of 
the transcript. Accordingly, the oleosin 5' 
regulatory region that contains the consensus promoter 
sequence known as the TATA box can be ligated directly 
to a promoterless heterologous coding sequence. 

The restriction or deletion fragments that
contain the oleosin TATA box are ligated in a forward
orientation to a promoterless heterologous gene such
as the coding sequence of β-glucuronidase (GUS). The
skilled artisan will recognize that the subject
oleosin 5' regulatory regions can be provided by other
means, for example chemical or enzymatic synthesis.
The 3' end of a heterologous coding sequence is
optionally ligated to a termination sequence
comprising a polyadenylation site, exemplified by, but
not limited to, the nopaline synthase polyadenylation
site or the octopine T-DNA gene 7 polyadenylation
site. Alternatively, the polyadenylation site can be
provided by the heterologous gene.

The present invention also provides methods
of increasing levels of heterologous genes in plant
seeds. In accordance with such methods, the subject 
expression cassettes and expression vectors are 
introduced into a plant in order to effect expression 
of a heterologous gene. For example, a method of 
producing a plant with increased levels of a product 
of a fatty acid synthesis or lipid metabolism gene is 
provided by transforming a plant cell with an 
expression vector comprising an oleosin 5' regulatory 
region operably linked to a fatty acid synthesis or 
lipid metabolism gene and regenerating a plant with 
increased levels of the product of said fatty acid 
synthesis or lipid metabolism gene. 

Another aspect of the present invention 
provides methods of reducing levels of a product of a 
gene which is native to a plant which comprises 
transforming a plant cell with an expression vector 
comprising a subject oleosin regulatory region 
operably linked to a nucleic acid sequence which is 
complementary to the native plant gene. In this 
manner, levels of endogenous product of the native 
plant gene are reduced through the mechanism known as 
antisense regulation. Thus, for example, levels of a 
product of a fatty acid synthesis gene or lipid 
metabolism gene are reduced by transforming a plant 
with an expression vector comprising a subject oleosin 
5' regulatory region operably linked to a nucleic acid 
sequence which is complementary to a nucleic acid
sequence coding for a native fatty acid synthesis or
lipid metabolism gene.
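
The "sequence which is complementary to the native plant gene" used for antisense regulation is, in sequence terms, the reverse complement of the coding strand. A minimal sketch follows; the function name and the nine-nucleotide example are hypothetical and do not correspond to any gene named in this specification.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    if set(sequence) - set("ACGTacgt"):
        raise ValueError("sequence may contain only A, C, G and T")
    return sequence.translate(COMPLEMENT)[::-1]


# Hypothetical fragment of a sense (coding) strand.
sense = "ATGGCTTCA"
antisense = reverse_complement(sense)
```

Placing `antisense` rather than `sense` downstream of the regulatory region is what distinguishes an antisense construct from a construct expressing the coding sequence itself.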

The present invention also provides a method
of cosuppressing a gene which is native to a plant
which comprises transforming a plant cell with an
expression vector comprising a subject oleosin 5'
regulatory region operably linked to a nucleic acid
sequence coding for the native plant gene. In this
manner, levels of endogenous product of the native
plant gene are reduced through the mechanism known as
cosuppression. Thus, for example, levels of a product
of a fatty acid synthesis gene or lipid metabolism
gene are reduced by transforming a plant with an
expression vector comprising a subject oleosin 5'
regulatory region operably linked to a nucleic acid
sequence coding for a fatty acid synthesis or
lipid metabolism gene native to the plant. Although
the exact mechanism of cosuppression is not completely
understood, one skilled in the art is familiar with
published works reporting the experimental conditions
and results associated with cosuppression (Napoli et
al. 1990 The Plant Cell 2:270-289; Van der Krol 1990
The Plant Cell 2:291-299).

To provide regulated expression of the 
heterologous or native genes, plants are transformed 
with the chimeric gene constructions of the invention. 
Methods of gene transfer are well known in the art. 
The chimeric genes can be introduced into plants by
the leaf disk transformation-regeneration procedure as
described by Horsch et al. 1985 Science 227:1229.
Other methods of transformation such as protoplast
culture (Horsch et al. 1984 Science 223:496, DeBlock
et al. 1984 EMBO J. 2:2143, Barton et al. 1983, Cell
32:1033) can also be used and are within the scope of
this invention. In a preferred embodiment, plants are
transformed with Agrobacterium-derived vectors such as
those described in Klett et al. (1987) Annu. Rev.
Plant Physiol. 38:461. Other well-known methods are
available to insert the chimeric genes of the present
invention into plant cells. Such alternative methods
include biolistic approaches (Klein et al. 1987 Nature
327:70), electroporation, chemically-induced DNA
uptake, and use of viruses or pollen as vectors.

When necessary for the transformation
method, the chimeric genes of the present invention
can be inserted into a plant transformation vector,
e.g. the binary vector described by Bevan, M. 1984
Nucleic Acids Res. 12:8711-8721. Plant transformation
vectors can be derived by modifying the natural gene
transfer system of Agrobacterium tumefaciens. The
natural system comprises large Ti (tumor-inducing)
plasmids containing a large segment, known as T-DNA,
which is transferred to transformed plants. Another
segment of the Ti plasmid, the vir region, is
responsible for T-DNA transfer. The T-DNA region is
bordered by terminal repeats. In the modified binary
vectors, the tumor inducing genes have been deleted
and the functions of the vir region are utilized to
transfer foreign DNA bordered by the T-DNA border
sequences. The T-region also contains a selectable
marker for antibiotic resistance, and a multiple
cloning site for inserting sequences for transfer.
Such engineered strains are known as "disarmed" A.
tumefaciens strains, and allow the efficient transfer
of sequences bordered by the T-region into the nuclear
genome of plants.

Surface-sterilized leaf disks and other
susceptible tissues are inoculated with the "disarmed"
foreign DNA-containing A. tumefaciens, cultured for a
number of days, and then transferred to
antibiotic-containing medium. Transformed shoots are then
selected after rooting in medium containing the 
appropriate antibiotic, and transferred to soil. 
Transgenic plants are pollinated and seeds from these 
plants are collected and grown on antibiotic medium. 

Expression of a heterologous or reporter 
gene in developing seeds, young seedlings and mature 
plants can be monitored by immunological, 
histochemical or activity assays. As discussed 
herein, the choice of an assay for expression of the 
chimeric gene depends upon the nature of the 
heterologous coding region. For example, Northern
analysis can be used to assess transcription if
appropriate nucleotide probes are available. If
antibodies to the polypeptide encoded by the
heterologous gene are available, Western analysis and
immunohistochemical localization can be used to assess
the production and localization of the polypeptide.
Depending upon the heterologous gene, appropriate
biochemical assays can be used. For example,
acetyltransferases are detected by measuring
acetylation of a standard substrate. The expression
of a lipid desaturase gene can be assayed by analysis
of fatty acid methyl esters (FAMEs).

Another aspect of the present invention
provides transgenic plants or progeny of these plants
containing the chimeric genes of the invention.
Monocotyledonous and dicotyledonous plants are
contemplated. Plant cells are transformed with the
chimeric genes by any of the plant transformation
methods described above. The transformed plant cell,
usually in the form of a callus [illegible],
explant or whole plant ([illegible]
method of Bechtold et al. 1993 C.R. Acad. Sci.
316:1194-1199) is regenerated into a complete
transgenic plant by methods well-known to one of
ordinary skill in the art (e.g. Horsch et al. 1985
Science 227:1229). In a preferred embodiment, the
transgenic plant is sunflower, cotton, oil seed rape,
maize, tobacco, Arabidopsis, peanut or soybean. Since
progeny of transformed plants inherit the chimeric
genes, seeds or cuttings from transformed plants are
used to maintain the transgenic line.

The following examples further illustrate
the invention.

EXAMPLE 1

Isolation of Membrane-Bound Polysomal
RNA and Construction of Borage cDNA Library

Membrane-bound polysomes were isolated from
borage seeds 12 days post pollination (12 DPP) using
the protocol established for peas by Larkins and
Davies (1975 Plant Phys. 55:749-756). RNA was
extracted from the polysomes as described by Mechler
(1987 Methods in Enzymology 152:241-248, Academic
Press). Poly-A+ RNA was isolated from the membrane-bound
polysomal RNA using Oligotex-dT™ beads (Qiagen).

Corresponding cDNA was made using 
Stratagene's ZAP cDNA synthesis kit. The cDNA library 
was constructed in the lambda ZAP II vector 
(Stratagene) using the lambda ZAP II kit. The primary 
library was packaged with Gigapack II Gold packaging
extract (Stratagene).

EXAMPLE 2

Isolation of a Δ6-Desaturase cDNA from Borage

Hybridization protocol

The amplified borage cDNA library was plated
at low density (500 pfu on 150 mm petri dishes).
Highly prevalent seed storage protein cDNAs were
reduced (subtracted from the total cDNAs) by screening
with the corresponding cDNAs.

Hybridization probes for screening the
borage cDNA library were generated by using random
primed DNA synthesis as described by Ausubel et al.
(1994 Current Protocols in Molecular Biology, Wiley
Interscience, N.Y.) and corresponded to previously
identified abundantly expressed seed storage protein
cDNAs. Unincorporated nucleotides were removed by use
of a G-50 spin column (Boehringer Mannheim). Probe was
denatured for hybridization by boiling in a water bath
for 5 minutes, then quickly cooled on ice.
Nitrocellulose filters carrying fixed recombinant
bacteriophage were prehybridized at 60°C for 2-4 hours
in hybridization solution [4X SET (600 mM NaCl, 80 mM
Tris-HCl, 4 mM Na2EDTA; pH 7.8), 5X Denhardt's reagent
(0.1% bovine serum albumin, 0.1% Ficoll, and 0.1%
polyvinylpyrrolidone), 100 μg/ml denatured salmon sperm
DNA, 50 μg/ml polyadenine and 10 μg/ml polycytidine].
This was replaced with fresh hybridization solution to
which denatured radioactive probe (2 ng/ml
hybridization solution) was added. The filters were
incubated at 60°C with agitation overnight. Filters
were washed sequentially in 4X, 2X, and 1X SET (150 mM
NaCl, 20 mM Tris-HCl, 1 mM Na2EDTA; pH 7.8) for 15
minutes each at 60°C. Filters were air dried and then
exposed to X-ray film for 24 hours with intensifying
screens at -80°C.
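
The SET concentrations quoted for the hybridization solution and the washes are related by simple scaling of the 1X composition. The short sketch below makes that arithmetic explicit; the dictionary and function names are hypothetical, and the 1X values are those stated in the text.

```python
# 1X SET composition in mM, as given in the protocol above.
SET_1X_MM = {"NaCl": 150.0, "Tris-HCl": 20.0, "Na2EDTA": 1.0}


def set_buffer(fold):
    """Return component concentrations (mM) of SET at the given strength."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {component: conc * fold for component, conc in SET_1X_MM.items()}


# Wash series used in the protocol: 4X, then 2X, then 1X SET.
wash_series = [set_buffer(fold) for fold in (4, 2, 1)]
```

Scaling the 1X values by four reproduces the 600 mM NaCl, 80 mM Tris-HCl and 4 mM Na2EDTA quoted for the 4X hybridization solution, so the two compositions in the text are mutually consistent.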

Non-hybridizing plaques were excised using 
Stratagene's excision protocol and reagents. 
Resulting bacterial colonies were used to inoculate 
liquid cultures and were either sequenced manually or 
by an ABI automated sequencer. 

Random Sequencing of cDNAs from a Borage 12 (DPP)
Membrane-Bound Polysomal Library

Each cDNA corresponding to a non-hybridizing
plaque was sequenced once and a sequence
tag generated from 200-300 base pairs. All sequencing
was performed by cycle sequencing (Epicentre). Over
300 expressed sequence tags (ESTs) were generated.
Each sequence tag was compared to the GenBank database
using the BLAST algorithm (Altschul et al. 1990 J.
Mol. Biol. 215:403-410). A number of lipid metabolism
genes, including the Δ6-desaturase, were identified.

Database searches with the cDNA clone
designated mbp-65 using BLASTX with the GenBank
database resulted in a significant match to the
previously isolated Synechocystis Δ6-desaturase. It
was determined, however, that mbp-65 was not a full
length cDNA. A full length cDNA was isolated using
mbp-65 to screen the borage membrane-bound polysomal
library. The resultant clone was designated pAN1 and
the cDNA insert of pAN1 was sequenced by the cycle
sequencing method. The amino acid sequence deduced
from the open reading frame (Fig. 1, SEQ ID NO:1) was
compared to other known desaturases using the Geneworks
(IntelliGenetics) protein alignment program. This
alignment indicated that the cDNA insert of pAN1 was
the borage Δ6-desaturase gene.

The resulting dendrogram (Figure 2) shows
that Δ15-desaturases and Δ12-desaturases comprise two
groups. The newly isolated borage sequence and the
previously isolated Synechocystis Δ6-desaturase (U.S.
Patent No. 5,552,306) formed a third distinct group.
A comparison of amino acid motifs common to
desaturases and thought to be involved catalytically
in metal binding illustrates the overall similarity of
the protein encoded by the borage gene to desaturases
in general and the Synechocystis Δ6-desaturase in
particular (Table 1). At the same time, comparison of
the motifs in Table 1 indicates definite differences
between this protein and other plant desaturases.
Furthermore, the borage sequence is also distinguished
from known plant membrane associated fatty acid
desaturases by the presence of a heme binding motif
conserved in cytochrome b5 proteins (Schmidt et al.
1994 Plant Mol. Biol. 25:631-642) (Figure 1). Thus,
while these results clearly suggested that the
isolated cDNA was a borage Δ6-desaturase gene, further
confirmation was necessary. To confirm the identity
of the borage Δ6-desaturase cDNA, the cDNA insert from
pAN1 was cloned into an expression cassette for stable
expression. The vector pBI121 (Jefferson et al. 1987
EMBO J. 5:3901-3907) was prepared for ligation by
digestion with BamHI and EcoICR I (an isoschizomer of
SacI which leaves blunt ends; available from Promega)
which excises the GUS coding region leaving the 35S
promoter and NOS terminator intact. The borage
desaturase cDNA was excised from the recombinant
plasmid (pAN1) by digestion with BamHI and XhoI. The
XhoI end was made blunt by performing a fill-in
reaction catalyzed by the Klenow fragment of DNA
polymerase I. This fragment was then cloned into the
BamHI/EcoICR I sites of pBI121.1, resulting in the
plasmid pAN2.

EXAMPLE 3

Production of Transgenic
Plants and Preparation and
Analysis of Fatty Acid Methyl Esters (FAMEs)

The expression plasmid, pAN2, was used to
transform tobacco (Nicotiana tabacum cv. xanthi) via
Agrobacterium tumefaciens according to standard
procedures (Horsch et al. 1985 Science 227:1229-1231;
Bogue et al. 1990 Mol. Gen. Genet. 221:49-51) except
that the initial transformants were selected on 100
μg/ml kanamycin.

Tissue from transgenic plants was frozen in
liquid nitrogen and lyophilized overnight. FAMEs were
prepared as described by Dahmer et al. (1989) J.
Amer. Oil. Chem. Soc. 66:543-548. In some cases, the
solvent was evaporated again, and the FAMEs were
resuspended in ethyl acetate and extracted once with
deionized water to remove any water soluble
contaminants. FAMEs were analyzed using a Tracor-560
gas liquid chromatograph as previously described
(Reddy et al. 1996 Nature Biotech. 14:629-642).

As shown in Figure 3, transgenic tobacco
leaves containing the borage cDNA produced both GLA
and octadecatetraenoic acid (OTA) (18:4 Δ6,9,12,15).

These results thus demonstrate that the isolated cDNA 

encodes a borage a6 -desaturase. 
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EXAMPLE 4 

Expression of A6-desaturase in Borage 

The native expression of Δ6-desaturase was 
examined by Northern analysis of RNA derived from 
borage tissues. RNA was isolated from developing 
borage embryos following the method of Chang et al. 
1993 Plant Mol. Biol. Rep. 11:113-116. RNA was 
electrophoretically separated on formaldehyde-agarose 
gels, blotted to nylon membranes by capillary 
transfer, and immobilized by baking at 80°C for 30 
minutes following standard protocols (Brown T., 1996 
in Current Protocols in Molecular Biology, eds. 
Ausubel et al. [Greene Publishing and 
Wiley-Interscience, New York] pp. 4.9.1-4.9.14). The 
filters were preincubated at 42°C in a solution 
containing 50% deionized formamide, 5X Denhardt's 
reagent, 5X SSPE (900 mM NaCl; 50 mM sodium phosphate, 
pH 7.7; and 5 mM EDTA), 0.1% SDS, and 200 μg/ml 
denatured salmon sperm DNA. After two hours, the 
filters were added to a fresh solution of the same 
composition with the addition of denatured radioactive 
hybridization probe. In this instance, the probes 
used were borage legumin cDNA (Fig. 16A), borage 
oleosin cDNA (Fig. 16B), and borage Δ6-desaturase cDNA 
(pAN1, Example 2) (Fig. 16C). The borage legumin and 
oleosin cDNAs were isolated by EST cloning and 
identified by comparison to the GenBank database using 
the BLAST algorithm as described in Example 2. 
Loading variation was corrected by normalizing to 
levels of borage EF1α mRNA. EF1α mRNA was identified 
by correlating to the corresponding cDNA obtained by 
the EST analysis described in Example 2. The filters 
were hybridized at 42°C for 12-20 hours, then washed 
as described above (except that the temperature was 
65°C), air dried, and exposed to X-ray film. 
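The concentrations quoted for 5X SSPE follow from scaling a 1X stock linearly. A minimal sketch of that arithmetic, assuming the conventional 1X SSPE recipe (180 mM NaCl, 10 mM sodium phosphate, 1 mM EDTA), which the text does not state explicitly; the names `SSPE_1X_MM` and `scale_buffer` are illustrative only:

```python
# Assumed 1X SSPE composition in mM (conventional recipe; not stated in the text).
SSPE_1X_MM = {"NaCl": 180.0, "sodium phosphate": 10.0, "EDTA": 1.0}


def scale_buffer(stock_1x_mm, fold):
    """Return component concentrations (mM) of a buffer at `fold` strength."""
    return {name: conc * fold for name, conc in stock_1x_mm.items()}


sspe_5x = scale_buffer(SSPE_1X_MM, 5)
for name, conc in sspe_5x.items():
    print(f"{name}: {conc:g} mM")
```

The same helper could be applied to the Denhardt's reagent or SSC working strengths if their 1X recipes are supplied.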

As depicted in Figs. 15A and 15B, Δ6-desaturase 
is expressed primarily in borage seed. 
Borage seeds reach maturation between 18-20 days post 
pollination (dpp). Δ6-desaturase mRNA expression 
occurs throughout the time points collected (8-20 
dpp), but appears maximal from 10-16 days post 
pollination. This expression profile is similar to 
that seen for borage oleosin and 12S seed storage 
protein mRNAs (Figs. 16A, 16B, and 16C). 

Isolation and Characterization of a Novel Oleosin cDNA 

The oleosin cDNA (AtS21) was isolated by 
virtual subtraction screening of an Arabidopsis 
developing seed cDNA library using a random primed 
polymerase chain reaction (RP-PCR) cDNA probe derived 
from root tissue. 

RNA Preparation 

Arabidopsis thaliana Landsberg erecta plants 
were grown under continuous illumination in a 
vermiculite/soil mixture at ambient temperature 
(22°C). Siliques 2-5 days after flowering were 
dissected to separately collect developing seeds and 
silique coats. Inflorescences containing initial 
flower buds and fully opened flowers, leaves, and 
whole siliques one or three days after flowering were 
also collected. Roots were obtained from seedlings 
that had been grown in Gamborg B5 liquid medium (GIBCO 
BRL) for two weeks. The seeds for root culture were 
previously sterilized with 50% bleach for five minutes 
and rinsed with water extensively. All tissues were 
frozen in liquid nitrogen and stored at -80°C until 
use. Total RNAs were isolated following a hot 
phenol/SDS extraction and LiCl precipitation protocol 
(Harris et al. 1978 Biochem. 17:3251-3256; Galau et 
al. 1981 J. Biol. Chem. 255:2551-2560). Poly A+ RNA 
was isolated using oligo dT column chromatography 
according to manufacturers' protocols (PHARMACIA or 
STRATAGENE) or using oligotex-dT latex particles 
(QIAGEN). 

Flower, one day silique, three day silique, 
leaf, root, and developing seed cDNA libraries were 
each constructed from 5 μg poly A+ RNA using the ZAP 
cDNA synthesis kit (Stratagene). cDNAs were 
directionally cloned into the EcoRI and XhoI sites of 
pBluescript SK(-) in the λ-ZAPII vector (Short et al. 
1988 Nucleic Acids Res. 16:7583-7600). Nonrecombinant 
phage plaques were identified by blue color 
development on NZY plates containing X-gal 
(5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside) and IPTG 
(isopropyl-1-thio-β-D-galactopyranoside). The 
nonrecombinant backgrounds for the flower, one day 
silique, three day silique, leaf, root, and developing 
seed cDNA libraries were 2.8%, 2%, 3.3%, 6.5%, 2.5%, 
and 1.9%, respectively. 

Random Priming DNA Labeling 

The cDNA inserts of isolated clones 
(unhybridized cDNAs) were excised by EcoRI/XhoI double 
digestion and gel-purified for random priming 
labeling. The Klenow reaction mixture contained 50 ng DNA 
template, 10 mM Tris-HCl, pH 7.5, 5 mM MgCl2, 7.5 mM 
DTT, 50 μM each of dCTP, dGTP, and dTTP, 10 μM hexamer 
random primers (Boehringer Mannheim), 50 μCi α-32P-dATP 
(3000 Ci/mmole, 10 mCi/ml; DuPont), and 5 units 
of DNA polymerase I Klenow fragment (New England 
Biolabs). The reactions were carried out at 37°C for 
one hour. Aliquots of diluted reaction mixtures were 
used for TCA precipitation and alkaline denaturing gel 
analysis. Hybridization probes were labeled only with 
Klenow DNA polymerase and the unincorporated dNTPs 
were removed using Sephadex® G-50 spin columns 
(Boehringer Mannheim). 

Random Primed PCR 

Double-stranded cDNA was synthesized from 
poly A+ RNA isolated from Arabidopsis root tissue 
using the cDNA Synthesis System (GIBCO BRL) with oligo 
dT12-18 as primers. cDNAs longer than 300 bp were 
enriched by Sephacryl S-400 column chromatography 
(Stratagene). Fractionated cDNAs were used as 
templates for RP-PCR labeling. The reaction contained 
10 mM Tris-HCl, pH 9.0, 50 mM KCl, 0.1% Triton X-100, 
2 mM MgCl2, 5 units Taq DNA polymerase (PROMEGA), 200 
μM dCTP, dGTP, and dTTP, and different concentrations 
of hexamer random primers, α-32P dATP (800 mCi/mmole, 
10 mCi/ml; DuPont), and cold dATP in a final volume of 
25 μl. After an initial 5 minutes at 95°C, different 
reactions were run through different programs to 
optimize RP-PCR cDNA conditions. Unless otherwise 
indicated, the following program was used for most 
RP-PCR cDNA probe labeling: 95°C/5 minutes, then 40 
cycles of 95°C/30 seconds, 18°C/1 second, ramp to 30°C 
at a rate of 0.1°C/second, 72°C/1 minute. RP-PCR 
products were phenol/chloroform extracted and ethanol 
precipitated or purified by passing through Sephadex 
G-50 spin columns (Boehringer Mannheim). 
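The slow ramp from 18°C to 30°C dominates the length of each RP-PCR cycle. The sketch below tallies the programmed time of the standard labeling program as described in the text. It assumes that every temperature transition other than the programmed slow ramp is instantaneous, so a real run on any instrument would take longer; the function names are illustrative.

```python
# RP-PCR labeling program from the text: temperatures in deg C, times in seconds.
INITIAL_DENATURATION_S = 5 * 60   # 95 C for 5 minutes
CYCLES = 40


def ramp_seconds(start_c, end_c, rate_c_per_s):
    """Time needed to move between two temperatures at a constant rate."""
    return abs(end_c - start_c) / rate_c_per_s


def cycle_seconds():
    """Programmed duration of one cycle: three holds plus the slow ramp.

    Assumption: transitions other than the 18 -> 30 C ramp take no time.
    """
    denature = 30                           # 95 C / 30 s
    anneal = 1                              # 18 C / 1 s
    slow_ramp = ramp_seconds(18, 30, 0.1)   # 12 C at 0.1 C/s
    extend = 60                             # 72 C / 1 min
    return denature + anneal + slow_ramp + extend


total_s = INITIAL_DENATURATION_S + CYCLES * cycle_seconds()
print(f"per cycle: {cycle_seconds():.0f} s, total: {total_s / 60:.1f} min")
```

Under these assumptions the ramp accounts for more than half of each cycle, which is why the program runs for well over two hours despite the one-second annealing hold.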

Clone Blot Virtual Subtraction 

Mass excision of λ-ZAP cDNA libraries was 
carried out by co-infecting XL1-Blue MRF' host cells 
with recombinant phage from the libraries and ExAssist 
helper phage (STRATAGENE). Excised phagemids were 
rescued by SOLR cells. Plasmid DNAs were prepared by 
the boiling mini-prep method (Holmes et al. 1981 Anal. 
Biochem. 114:193-197) from randomly isolated clones. 
cDNA inserts were excised by EcoRI and XhoI double 
digestion, and resolved on 1% agarose gels. The DNAs 
were denatured in 0.5 N NaOH and 1.5 M NaCl for 45 
minutes, neutralized in 0.5 M Tris-HCl, pH 8.0, and 
1.5 M NaCl for 45 minutes, and then transferred by 
blotting to nylon membranes (Micron Separations, Inc.) 
in 10X SSC overnight. After one hour prehybridization 
at 65°C, root RP-cDNA probe was added to the same 
hybridization buffer containing 1% bovine albumin 
fraction V (Sigma), 1 mM EDTA, 0.5 M NaHPO4, pH 7.2, 
7% SDS. The hybridization continued for 24 hours at 
65°C. The filters were washed in 0.5% bovine albumin, 
1 mM EDTA, 40 mM NaHPO4, pH 7.2, 5% SDS for ten 
minutes at room temperature, and 3 x 10 minutes in 1 
mM EDTA, 40 mM NaHPO4, pH 7.2, 1% SDS at 65°C. 
Autoradiographs were exposed to X-ray films (Kodak) 
for two to five days at -80°C. 

Hybridization of resulting blots with root 
RP-PCR probes "virtually subtracted" seed cDNAs shared 
with the root mRNA population. The remaining seed 
cDNAs representing putative seed-specific cDNAs, 
including those encoding oleosins, were sequenced by 
the cycle sequencing method, thereby identifying AtS21 
as an oleosin cDNA clone. 

Sequence analysis of AtS21. 

The oleosin cDNA is 834 bp long including an 
18 bp long poly A tail (Fig. 4, SEQ ID NO:2). It has 
high homology to other oleosin genes from Arabidopsis 
as well as from other species. Recently, an identical 
oleosin gene has been reported (Zou et al., 1996, 
Plant Mol. Biol. 31:429-433). The predicted protein is 
191 amino acids long with a highly hydrophobic middle 
domain flanked by a hydrophilic domain on each side. 
The two in-frame stop codons just upstream of the 
first ATG, together with the similarity to other 
oleosin genes, indicate that this cDNA is full-length 
(Figure 4, SEQ ID NO:2). The predicted protein has 
three distinctive 
domains based on the distribution of its amino acid 
residues. Both the N-terminal and C-terminal domains 
are rich in charged residues while the central domain 
is absolutely hydrophobic (Figure 5). As many as 20 
leucine residues are located in the central domain and 
arranged as repeats with one leucine occurring every 
7-10 residues. Other non-polar amino acid residues 
are also clustered in the central domain, making this 
domain absolutely hydrophobic (Figure 6). 

Extensive searches of different databases 
using both AtS21 cDNA and its predicted protein 
sequence identified oleosins from carrot, maize, 
cotton, rapeseed, Arabidopsis, and other plant 
species. The homology is mainly restricted to the 
central hydrophobic domain. Seven Arabidopsis oleosin 
sequences were found. AtS21 represents the same gene 
as Z54164, which has a few more bases in the 5' 
untranslated region. The seven Arabidopsis oleosin 
sequences available so far were aligned to each other 
(Figure 7). The result suggested that the seven 
sequences fall into three groups. The first group 
includes AtS21 (SEQ ID NO:5), X91918 (SEQ ID NO:6), 
and the partial sequence Z29859 (SEQ ID NO:7). Since 
X91918 (SEQ ID NO:6) has only its last residue 
different from AtS21 (SEQ ID NO:5), and since Z29859 
(SEQ ID NO:7) has only three amino acid residues which 
are different from AtS21 (SEQ ID NO:5), all three 
sequences likely represent the same gene. The two 
sequences of the second group, X62352 (SEQ ID NO:8) 
and Atol3 (SEQ ID NO:9), are different in both 
sequence and length. Thus, there is no doubt that 
they represent two independent genes. Like the first 
group, the two sequences of the third group, X91956 
(SEQ ID NO:10) and L40954 (SEQ ID NO:11), also have 
only three divergent residues, which may be due to 
sequencing errors. Thus, X91956 (SEQ ID NO:10) and 
L40954 (SEQ ID NO:11) likely represent the same gene. 
Unlike all the other oleosin sequences, which were 
predicted from cDNA sequences, X62352 (SEQ ID NO:8) 
was deduced from a genomic sequence (Van Rooijen et 
al. 1992 Plant Mol. Biol. 18:1177-1179). In 
conclusion, four different Arabidopsis oleosin genes 
have been identified so far, and they are conserved 
only in the middle of the hydrophobic domain. 
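The "same gene" calls above rest on counting divergent residues between aligned sequences. The following sketch illustrates that kind of bookkeeping. It is not the procedure used to produce Figure 7: the cutoff of three residues is an assumption borrowed from the counts quoted in the text, the toy sequences are hypothetical and are not the real oleosins, and the inputs are assumed to be pre-aligned to equal length.

```python
from itertools import combinations


def count_differences(a, b):
    """Number of mismatched positions between two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def group_sequences(aligned, max_diff=3):
    """Single-linkage grouping: sequences that differ at no more than
    `max_diff` positions are placed in the same group."""
    parent = {name: name for name in aligned}

    def find(name):
        while parent[name] != name:
            name = parent[name]
        return name

    for n1, n2 in combinations(aligned, 2):
        if count_differences(aligned[n1], aligned[n2]) <= max_diff:
            parent[find(n2)] = find(n1)

    groups = {}
    for name in aligned:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical toy alignment (NOT the real oleosin sequences).
toy = {
    "seqA": "MADTHRVDRTDRHFQ",
    "seqB": "MADTHRVDRTDRHFK",   # differs from seqA only at the last residue
    "seqC": "MGTTTYDRHHVTTTQ",   # unrelated
}
print(group_sequences(toy))
```

A small mismatch count by itself cannot distinguish sequencing errors from allelic variation, so a cutoff of this kind only flags candidates for the judgment made in the text.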

Northern Analysis 

In order to characterize the expression 
pattern of the native AtS21 gene, Northern analysis 
was performed as described in Example 4 except that 
the probe was the AtS21 cDNA (pAN1 insert) labeled 
with 32P-dATP to a specific activity of 5 x 10« cpm/μg. 

Results indicated that the AtS21 gene is 
strongly expressed in developing seeds and weakly 
expressed in silique coats (Figure 8A). A much larger 
transcript, which might represent unprocessed AtS21 
pre-mRNA, was also detected in developing seed RNA. 
AtS21 was not detected in flower, leaf, root (Figure 
8A), or one day silique RNAs. A different Northern 
analysis revealed that AtS21 is also strongly 
expressed in imbibed germinating seeds (Figs. 13A and 
13B). 

Characterization of Oleosin 
Genomic Clones and Isolation of Oleosin Promoter 

Genomic clones were isolated by screening an 
Arabidopsis genomic DNA library using the full length 
cDNA (AtS21) as a probe. Two genomic clones were 
mapped by restriction enzyme digestion followed by 
Southern hybridization using the 5' half of the cDNA 
cleaved by SacI as a probe. A 2 kb SacI fragment was 
subcloned and sequenced (Fig. 9, SEQ ID NO:35). Two 
regions of the genomic clone are identical to the cDNA 
sequence. A 395 bp intron separates the two regions. 

The copy number of the AtS21 gene in the 
Arabidopsis genome was determined by genomic DNA 
Southern hybridization following digestion with the 
enzymes BamHI, EcoRI, HindIII, SacI and XbaI, using 
the full length cDNA as a probe (Figure 8B). A single 
band was detected in all the lanes except the SacI 
digestion, where two bands were detected. Since the 
cDNA probe has an internal SacI site, these results 
indicated that AtS21 is a single copy gene in the 
Arabidopsis genome. Since the 
Arabidopsis genome is known to contain different isoforms of 
oleosin genes, this Southern analysis also 
demonstrates that the different oleosin isoforms of 
Arabidopsis are divergent at the DNA sequence level. 
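The copy-number argument can be restated as a simple count of expected bands per enzyme. This is a deliberately simplified sketch with hypothetical names: it assumes every gene copy is cut at the same sites inside the probed region, that every resulting fragment overlaps the probe enough to give a signal, and that no two fragments co-migrate.

```python
def expected_bands(copies, sites_within_probed_region):
    """Bands expected on a genomic Southern blot for one enzyme.

    Each cut inside the probed region splits the hybridizing DNA into one
    additional fragment per gene copy (simplifying assumptions as stated).
    """
    return copies * (sites_within_probed_region + 1)


# Internal-site counts for a single-copy gene, following the text:
# only SacI cuts within the region covered by the cDNA probe.
internal_sites = {"BamHI": 0, "EcoRI": 0, "HindIII": 0, "SacI": 1, "XbaI": 0}
for enzyme, n_sites in internal_sites.items():
    print(enzyme, expected_bands(copies=1, sites_within_probed_region=n_sites))
```

Under this model a second cross-hybridizing copy would double the band count in every lane, which is not what was observed.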

Two regions, separated by a 395 bp intron, 
of the genomic DNA fragment are identical to the AtS21 
cDNA sequence. Database searches using the 5' 
promoter sequence upstream of the AtS21 cDNA sequence did 
not identify any sequence with significant homology. 
Furthermore, the comparison of the AtS21 promoter sequence 
with another Arabidopsis oleosin promoter isolated 
previously (Van Rooijen et al., 1992) revealed 
little similarity. The AtS21 promoter sequence is 
rich in A/T bases, and contains as many as 44 direct 
repeats ranging from 10 bp to 14 bp with only one 
mismatch allowed. Two 14 bp direct repeats and a 
putative ABA response element are underlined in Figure 
9. 
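A count of direct repeats "with only one mismatch allowed" can be obtained with a brute-force window comparison. The sketch below is one way to do it and is not necessarily how the 44 repeats were found; the toy sequence is hypothetical and is not SEQ ID NO:35, and the requirement that the two windows do not overlap is an assumption.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def find_direct_repeats(seq, length, max_mismatch=1):
    """Return (i, j) start pairs of non-overlapping, same-strand windows of
    `length` bp that differ at no more than `max_mismatch` positions."""
    seq = seq.upper()
    last_start = len(seq) - length
    hits = []
    for i in range(last_start + 1):
        for j in range(i + length, last_start + 1):
            if hamming(seq[i:i + length], seq[j:j + length]) <= max_mismatch:
                hits.append((i, j))
    return hits


# Hypothetical toy sequence: two 10 bp A/T-rich motifs that differ at one base.
toy_promoter = "GCGC" + "ATTTAAACGT" + "GGCCGGCA" + "ATTTAAACCT" + "CCGG"
print(find_direct_repeats(toy_promoter, length=10))
```

On an A/T-rich sequence such as a promoter, a repeat longer than the window shows up as several adjacent hits, so overlapping hits would need to be merged before repeats are counted.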
EXAMPLE 7 

Construction of AtS21 
Promoter/GUS Gene Expression Cassette and Expression 
Patterns in Transgenic Arabidopsis and Tobacco 

Construction of AtS21 promoter/GUS gene expression 
cassette. 

The 1267 bp promoter fragment starting from 
the first G upstream of the ATG codon of the genomic 
DNA fragment was amplified using PCR and fused to the 
GUS reporter gene for analysis of its activity. 
The promoter fragment of the AtS21 genomic clone was 
amplified by PCR using the T7 primer 
GTAATACGACTCACTATAGGGC (SEQ ID NO:13) and the 21P 
primer GGGGATCCTATACTAAAACTATAGAGTAAAGG (SEQ ID NO:14) 
complementary to the 5' untranslated region upstream 
of the first ATG codon (Figure 9). A BamHI cloning 
site was introduced by the 21P primer. The amplified 
fragment was cloned into the BamHI and SacI sites of 
pBluescript KS (Stratagene). Individual clones were 
sequenced to check for possible PCR mutations as well as 
the orientation of their inserts. The correct clone 
was digested with BamHI and HindIII, and the excised 
promoter fragment (1.3 kb) was cloned into the 
corresponding sites of pBI101.1 (Jefferson, R.A. 
1987a, Plant Mol. Biol. Rep. 5:387-405; Jefferson et 
al., 1987b, EMBO J. 6:3901-3907) upstream of the GUS 
gene. The resultant plasmid was designated pAN5 (Fig. 
10). The AtS21 promoter/GUS construct (pAN5) was 
introduced into both tobacco (by the leaf disc method, 
Horsch et al., 1985; Bogue et al. 1990 Mol. Gen. Genet. 
221:49-57) and Arabidopsis Columbia ecotype via vacuum 
infiltration as described by Bechtold et al. (1993) 
C.R. Acad. Sci. Paris, 316:1194-1199. Seeds were 
sterilized and selected on media containing 50 μg/ml 
kanamycin, 500 μg/ml carbenicillin. 
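The BamHI site carried by the 21P primer can be located directly in the primer sequence given above. A minimal sketch: the two primer strings are copied from the text, GGATCC is the standard BamHI recognition sequence, and the helper names are illustrative.

```python
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence (palindromic)

T7_PRIMER = "GTAATACGACTCACTATAGGGC"              # SEQ ID NO:13
PRIMER_21P = "GGGGATCCTATACTAAAACTATAGAGTAAAGG"   # SEQ ID NO:14


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_site(seq, site):
    """Start index of `site` on the given strand, or -1 if absent."""
    return seq.upper().find(site)


site_start = find_site(PRIMER_21P, BAMHI_SITE)
print("BamHI site in 21P starts at index", site_start)
print("BamHI site in T7 primer:", find_site(T7_PRIMER, BAMHI_SITE))
# Bases 3' of the site; treating all of them as template-derived is an assumption.
print("3' portion of 21P:", PRIMER_21P[site_start + len(BAMHI_SITE):])
```

Because the site is palindromic, searching one strand is sufficient; for a non-palindromic site the reverse complement would have to be searched as well.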
GUS activity assay: Expression patterns of the 
reporter GUS gene were revealed by histochemical 
staining (Jefferson et al., 1987a, Plant Mol. Biol. 
Rep. 5:387-405). Different tissues were stained in 
substrate solution containing 2 mg/ml 
5-bromo-4-chloro-3-indolyl-β-D-glucuronic acid (X-Gluc) 
(Research Organics, Inc.), 0.5 mM potassium 
ferrocyanide, and 0.5 mM potassium ferricyanide in 50 
mM sodium phosphate buffer, pH 7.0, at 37°C overnight, 
and then dehydrated successively in 20%, 40% and 80% 
ethanol (Jefferson et al., 1987). Photographs were 
taken using an Axiophot (Zeiss) compound microscope or 
Olympus SZH10 dissecting microscope. Slides were 
converted to digital images using a SprintScan 35LE 
slide scanner (Polaroid) and compiled using Adobe 
Photoshop™ 3.0.5 and Canvas™ 3.5. 

GUS activities were quantitatively measured 
by fluorometry using 2 mM 4-MUG 
(4-methylumbelliferyl-β-D-glucuronide) as substrate 
(Jefferson et al., 1987). Developing Arabidopsis seeds were staged 
according to their colors, and other plant tissues 
were collected and kept at -80°C until use. Plant 
tissues were ground in extraction buffer containing 50 
mM sodium phosphate, pH 7.0, 10 mM EDTA, 10 mM 
β-mercaptoethanol, 0.1% Triton X-100, and 0.1% sodium 
lauryl sarcosine. The tissue debris was removed by 5 
minutes of centrifugation in a microfuge. The 
supernatant was aliquoted and mixed with substrate and 
incubated at 37°C for 1 hour. Three replicates were 
assayed for each sample. The reactions were stopped 
by adding 4 volumes of 0.2 M sodium carbonate. 
Fluorescence was read using a TKO-100 DNA fluorometer 
(Hoefer Scientific Instruments). Protein 
concentrations of the extracts were determined by the 
Bradford method (Bio-Rad). 
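Fluorometric readings of this kind are normally reduced to a specific activity by dividing the amount of 4-MU released by the incubation time and the protein in the assay. The sketch below shows that arithmetic. The replicate values are hypothetical, normalization per mg of protein is an assumption (the text does not state the unit here), and the conversion from fluorescence units to pmol 4-MU with a standard curve is assumed to have been done already.

```python
def gus_specific_activity(pmol_4mu, minutes, protein_mg):
    """4-MU released per minute per mg protein (pmol/min/mg)."""
    if minutes <= 0 or protein_mg <= 0:
        raise ValueError("incubation time and protein amount must be positive")
    return pmol_4mu / minutes / protein_mg


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Hypothetical triplicate readings, already converted to pmol 4-MU.
replicate_pmol = [1180.0, 1220.0, 1200.0]
activity = gus_specific_activity(mean(replicate_pmol), minutes=60, protein_mg=0.02)
print(f"{activity:.1f} pmol 4-MU/min/mg protein")
```

Averaging the three replicates before normalizing matches the triplicate design described above; reporting the spread between replicates would be a natural extension.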

Expression patterns of AtS21 promoter/GUS in 
transgenic Arabidopsis and tobacco 

In Arabidopsis, GUS activity was detected in 
green seeds, and node regions where siliques, cauline 
leaves and branches join the inflorescence stem 
(Figures 11A and 11B). No GUS activity was detected 
in any leaf, root, flower, silique coat, or the 
internode regions of the inflorescence stem. Detailed 
studies of the GUS expression in developing seeds 
revealed that the AtS21 promoter was only active in 
green seeds in which the embryos had already developed 
beyond heart stage (Figures 11C and 11G). The 
youngest embryos showing GUS activity that could be 
detected by histochemical staining were at early 
torpedo stage. Interestingly, the staining was 
restricted to the lower part of the embryo including 
hypocotyl and embryonic radicle. No staining was 
detected in the young cotyledons (Figures 11D and 
11E). Cotyledons began to be stained when the embryos 
were at late torpedo or even early cotyledon stage 
(Figures 11F and 11H). Later, the entire embryos were 
stained, and the staining became more intense as the 
embryos matured (Figures 11I and 11J). It was also 
observed that GUS gene expression was restricted to 
the embryos. Seed coat and young endosperm were not 
stained (Figure 11C). 

GUS activity was also detected in developing 
seedlings. Young seedlings of 3-5 days old were 
stained everywhere. Although some root hairs close to 
the hypocotyl were stained (Figure 11K), most of the 
newly formed structures such as root hairs, lateral 
root primordia and shoot apex were not stained 
(Figures 11L and 11N). Later, the staining was 
restricted to cotyledons and hypocotyls when lateral 
roots grew from the elongating embryonic root. The 
staining on embryonic roots disappeared. No staining 
was observed on newly formed lateral roots, true 
leaves, or trichomes on true leaves (Figures 11M and 
11N). 

AtS21 promoter/GUS expression patterns in 
tobacco are basically the same as in Arabidopsis. GUS 
activity was only detected in late stage seeds and 
different node regions of mature plants. In 
germinating seeds, strong staining was detected 
throughout the entire embryos as soon as one hour 
after they were dissected from imbibed seeds. Mature 
endosperm, which Arabidopsis seeds do not have, was 
also stained, but the seed coat was not (Figure 12A). The root 
tips of some young seedlings of one transgenic line 
were not stained (Figure 12B). Otherwise, GUS 
expression patterns in developing tobacco seedlings 
were the same as in Arabidopsis seedlings (Figures 
12B, 12C, and 12D). Newly formed structures such as 
lateral roots and true leaves were not stained. 

Since the observed strong activities of 
AtS21 promoter/GUS in both Arabidopsis and tobacco 
seedlings are not consistent with the seed-specific 
expression of oleosin genes, Northern analysis was 
carried out to determine if AtS21 mRNA was present in 
developing seedlings where the GUS activity was so 
strong. RNAs prepared from seedlings at different 
stages from 24 hours to 12 days were analyzed by 
Northern hybridization using AtS21 cDNA as the probe. 
Surprisingly, AtS21 mRNA was detected at a high level 
comparable to that in developing seeds in 24-48 hour 
imbibed seeds. The mRNA level dropped dramatically 
when young seedlings first emerged at 74 hours 
(Figures 13A and 13B). In 96 hour and older 
seedlings, no signal was detected even with a longer 
exposure (Figure 13B). The loadings of RNA samples 
were checked by hybridizing the same blot with a 
tubulin gene probe (Figure 13C) which was isolated and 
identified by EST analysis as described in Example 2. 
Since AtS21 mRNA was so abundant in seeds, residual 
AtS21 probes remained on the blot even after extensive 
stripping. These results indicated that the AtS21 mRNA 
detected in imbibed seeds and very young seedlings is 
carried over from dry seeds. It has 
recently been reported that an oleosin Atol2 mRNA 
(identical to AtS21) is most abundant in dry seeds 
(Kirik et al., 1996 Plant Mol. Biol. 31:413-417). 
Similarly, the strong GUS activities in seedlings were 
most likely due to both the carry-over of 
β-glucuronidase protein and the de novo synthesis of 
β-glucuronidase from its mRNA carried over from the dry 
seed stage. 

Activity comparison between the 
AtS21 promoter and the 35S promoter 

The GUS activities in transgenic Arabidopsis 
developing seeds expressed by the AtS21 promoter were 
compared with those expressed by the 35S promoter in 
the construct pBI221 (Jefferson et al. EMBO J. 
6:3901-3907). The seeds were staged according to their 
colors (Table 2). The earliest stage was from 
globular to late heart stage when the seeds were still 
white but large enough to be dissected from the 
siliques. AtS21 promoter activity was detected at a 
level about three times lower than that of the 35S 
promoter at this stage. 35S promoter activity 
remained at the same low level throughout the entire 
embryo development. In contrast, AtS21 promoter 
activity increased quickly as the embryos passed 
torpedo stage and reached the highest level of 25,25 
pmole 4-MU/min. protein at mature stage (Figure 5-8). 
The peak activity of the AtS21 promoter is as 
much as 210 times higher than its lowest activity at 
globular to heart stage, and is close to 100 times 
higher than the 35S promoter activity at the same 
stage (Table 2). The activity levels of the AtS21 
promoter are similar to those of another Arabidopsis 
oleosin promoter expressed in Brassica napus (Plant et 
al. 1994, Plant Mol. Biol. 25:193-205). AtS21 promoter 
activity was also detected at background level in 
leaf. The high standard deviation, higher than the 
average itself, indicated that the GUS activity was 
only detected in the leaves of some lines (Table 2). 
On the other hand, 35S promoter activity in leaf was 
more than 20 times higher than that in seed. The 
side-by-side comparison of activities between the AtS21 
promoter and the 35S promoter is shown in Figure 14. 
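The remark that a standard deviation larger than the mean points to activity in only some lines can be illustrated numerically. The values below are hypothetical and are not taken from Table 2; they only show that a data set made mostly of zeros with one positive value has a sample standard deviation well above its mean.

```python
from statistics import mean, stdev

# Hypothetical leaf GUS activities (arbitrary units) for five independent lines:
# four lines with no detectable activity and one with a low signal.
leaf_activity = [0.0, 0.0, 0.0, 0.0, 12.0]

avg = mean(leaf_activity)
sd = stdev(leaf_activity)   # sample standard deviation (n - 1 denominator)
print(f"mean = {avg:.2f}, SD = {sd:.2f}, SD/mean = {sd / avg:.2f}")
```

For non-negative measurements, an SD/mean ratio above 1 is a quick indication of such a skewed distribution across lines, although it does not replace looking at the per-line values.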

Although the AtS21 promoter activity was 
about 3 times lower in dry seed of tobacco than in 
Arabidopsis dry seed, the absolute GUS activity was 
still higher than that expressed by the 35S promoter 
in Arabidopsis leaf (Table 2) . No detectable AtS21 
promoter activity was observed in tobacco leaf (Figure 
14). 

Comparison of the AtS21 promoter versus the 
35S promoter revealed that the latter is not a good 
promoter to express genes at high levels in developing 
seeds. Because of its consistent low activities 
throughout the entire embryo development period, the 35S 
promoter is useful for consistent low level expression 
of target genes. On the other hand, the AtS21 
promoter is a very strong promoter that can be used to 
express genes starting from heart stage embryos and 
accumulating until the dry seed stage. The 35S 
promoter, although not efficient, is better than the 
AtS21 promoter in expressing genes in embryos prior to 
heart stage. 

EXAMPLE 9 

Expression of the Borage Δ6-Desaturase Gene Under 
the Control of the AtS21 Promoter and Comparison to 
Expression under the Control of the CaMV 35S Promoter 

In order to create an expression construct 
with the AtS21 promoter driving expression of the 
borage Δ6-desaturase gene, the GUS coding fragment 
from pAN5 was removed by digestion with SmaI and 
EcoICR I. The cDNA insert of pAN1 (Example 2) was then 
excised by first digesting with XhoI (and filling in 
the residual overhang as above), and then digesting 
with SmaI. The resulting fragment was used to replace 
the excised portion of pAN5, yielding pAN3. 

After transformation of tobacco and 
Arabidopsis following the methods of Example 7, levels 
of Δ6-desaturase activity were monitored by assaying 
the corresponding fatty acid methyl esters of its 
reaction products, γ-linolenic acid (GLA) and 
octadecatetraenoic acid (OTA), using the methods 
referred to in Example 3. The GLA and OTA levels 
(Table 3) of the transgenic seeds ranged up to 6.7% of 
C18 fatty acids (Mean = 3.1%) and 2.8% (Mean = 1.1%), 
respectively. No GLA or OTA was detected in the 
leaves of these plants. In comparison, CaMV 35S 
promoter/Δ6-desaturase transgenic plants produced GLA 
levels in seeds ranging up to 3.1% of C18 fatty acids 
(Mean = 1.3%) and no measurable OTA in seeds. 

EXAMPLE 10 

Transformation of Oilseed Rape With an Expression 
Cassette Which Comprises the Oleosin 5' Regulatory 
Region Linked to the Borage Delta 6-Desaturase Gene 

Oilseed rape, cv. Westar, was transformed 
with the strain of Agrobacterium tumefaciens EHA105 
containing the plasmid pAN3 (i.e. the borage 
Δ6-desaturase gene under the control of the Arabidopsis 
oleosin promoter, Example 9). 

Terminal internodes of Westar were 
co-cultivated for 2-3 days with induced Agrobacterium 
tumefaciens strain EHA105 (Alt-Moerbe et al. 1988 Mol. 
Gen. Genet. 213:1-8; James et al. 1993 Plant Cell 
Reports 12:559-563), then transferred onto 
regeneration medium (Boulter et al. 1990 Plant Science 
70:91-99; Fry et al. 1987 Plant Cell Reports 
6:321-325). The regenerated shoots were transferred to 
growth medium (Pelletier et al. 1983 Mol. Gen. Genet. 
191:244-250), and a polymerase chain reaction (PCR) 
test was performed on leaf fragments to assess the 
presence of the gene. 

DNA was isolated from the leaves according 
to the protocol of KM Haymes et al. (1996) Plant 
Molecular Biology Reporter 14(3):280-284, and 
resuspended in 100 μl of water, without RNase 
treatment. 5 μl of extract were used for the PCR 
reaction, in a final volume of 50 μl. The reaction was 
performed in a Perkin-Elmer 9600 thermocycler, with 
the following cycles: 

1 cycle: 95°C, 5 minutes 

30 cycles: 95°C, 45 sec; 52°C, 45 sec; 

72°C, 1 minute 

1 cycle: 72°C, 5 minutes 

and the following primers (derived from near the metal 
box regions, as indicated in Fig. 1, SEQ ID NO:1): 
5' TGG AAA TGG AAC CAT AA 3' 
5' GGA AAC AAA TGA TGC TC 3' 

Amplification of the DNA revealed the expected 549 
base pair PCR fragment (Figure 17). 
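An expected product size such as 549 bp follows from where the two primers anneal on the template. The sketch below shows that calculation on a hypothetical toy template, not on SEQ ID NO:1, whose full sequence is not reproduced here. Treating the first listed primer as the forward (top-strand) primer is an assumption, and only exact matches are considered.

```python
def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon_length(template, forward, reverse):
    """Length of the product primed by `forward` (top strand) and `reverse`
    (bottom strand), or None if a primer has no exact match or the primers
    do not face each other."""
    template = template.upper()
    start = template.find(forward.upper())
    rev_site = template.find(reverse_complement(reverse))
    if start < 0 or rev_site < start:
        return None
    return rev_site + len(reverse) - start


FORWARD = "TGGAAATGGAACCATAA"   # first primer listed in the text
REVERSE = "GGAAACAAATGATGCTC"   # second primer listed in the text

# Hypothetical toy template: the spacer length is chosen so that the primer
# ends lie 549 bp apart, mirroring the product size reported in the text.
spacer = ("ACGT" * 129)[:515]
toy_template = "CCC" + FORWARD + spacer + reverse_complement(REVERSE) + "TTT"
print(amplicon_length(toy_template, FORWARD, REVERSE))
```

With the real sequence in hand, the same function could be used to confirm the primer orientation and the 549 bp figure.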

The positive shoots were transferred to 
elongation medium, then to rooting medium (DeBlock et 
al. 1989 Plant Physiol. 91:694-701). Shoots with a 
well-developed root system were transferred to the 
greenhouse. When plants were well developed, leaves 
were collected for Southern analysis in order to 
assess gene copy number. 

Genomic DNA was extracted according to the 
procedure of Bouchez et al. (1996) Plant Molecular 
Biology Reporter 14:115-123, digested with the 
restriction enzymes Bgl I and/or Cla I, 
electrophoretically separated on agarose gel (Maniatis 
et al. 1982, in Molecular Cloning: a Laboratory 
Manual. Cold Spring Harbor Laboratory Press, Cold 
Spring Harbor, NY), and prepared for transfer to nylon 
membranes (Nytran membrane, Schleicher & Schuell) 
according to the instructions of the manufacturer. 
DNA was then transferred to membranes overnight by 
capillary action using 20X SSC (Maniatis et al. 1982). 
Following transfer, the membranes were crosslinked by 
UV (Stratagene) for 30 seconds and pre-hybridized for 
1 hour at 65°C in 15 ml of a solution containing 
6X SSC, 0.5% SDS and 2.25% w/w dehydrated skim milk in 
glass vials in a hybridization oven (Appligene). The 
membranes were hybridized overnight in the same 
solution containing a denatured hybridization probe 
radiolabelled with 32P to a specific activity of 10 
cpm/pg by the random primer method (with the 
Ready-To-Go kit obtained from Pharmacia). The probe represents 
a PCR fragment of the borage delta 6-desaturase gene 
(obtained in the conditions and with the primers 
detailed above). After hybridization, the filters 
were washed at 65°C in 2X SSC, 0.1% SDS for 15 minutes, 
and 0.2X SSC, 0.1% SDS for 15 minutes. The membranes 
were then wrapped in Saran-Wrap and exposed to Kodak 
XAR film using an intensifying screen at -70°C in a 
light-proof cassette. Exposure time was generally 3 
days. 

The results obtained confirm the presence of 
the gene. According to the gene construct, the number 
of bands in each lane of DNA digested by Bgl I or Cla 
I represents the number of delta 6-desaturase genes 
present in the genomic DNA of the plant. The 
digestion with Bgl I and Cla I together generates a 
fragment of 3435 bp. 

The term "comprises" or "comprising" is 
defined as specifying the presence of the stated 
features, integers, steps, or components as referred to 
in the claims, but does not preclude the presence or 
addition of one or more other features, integers, steps, 
components, or groups thereof. 

SEQUENCE LISTING 






(1) GENERAL INFORMATION: 

(i) APPLICANT: Rhone Poulenc Agro 

Thomas, Terry L. 
Li, Zhongsen 

(ii) TITLE OF INVENTION: AN OLEOSIN 5' REGULATORY REGION FOR THE 
MODIFICATION OF PLANT SEED LIPID COMPOSITION 

(iii) NUMBER OF SEQUENCES: 35 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Scully, Scott, Murphy & Presser 

(B) STREET: 400 Garden City Plaza 

(C) CITY: Garden City 

(D) STATE: New York 

(E) COUNTRY: USA 

(F) ZIP: 11530 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 
(D) SOFTWARE: PatentIn Release #1.0, Version #1 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/831,575 

(B) FILING DATE: 9 April 1997 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: DiGiglio, Frank S. 

(B) REGISTRATION NUMBER: 31,346 

(C) REFERENCE/DOCKET NUMBER: 10203 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (516) 742-4343 

(B) TELEFAX: (516) 742-4366 

(2) INFORMATION FOR SEQ ID NO:1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1684 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 43..1387 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: 
ATATCTGCCT ACCCTCCCAA AGAGAGTAGT CATTTTTCAT CA ATG GCT GCT CAA 
Met Ala Ala Gln 
1 

ATC AAG AAA TAC ATT ACC TCA GAT GAA CTC AAG AAC CAC GAT AAA CCC 
Ile Lys Lys Tyr Ile Thr Ser Asp Glu Leu Lys Asn His Asp Lys Pro 
5 10 15 

GGA GAT CTA TGG ATC TCG ATT CAA GGG AAA GCC TAT GAT GTT TCG GAT 
Gly Asp Leu Trp Ile Ser Ile Gln Gly Lys Ala Tyr Asp Val Ser Asp 
25 30 35 

TGG GTG AAA GAC CAT CCA GGT GGC AGC TTT CCC TTG AAG AGT CTT GCT 
Trp Val Lys Asp His Pro Gly Gly Ser Phe Pro Leu Lys Ser Leu Ala 
40 45 

GGT CAA GAG GTA ACT GAT GCA TTT GTT GCA TTC CAT CCT GCC TCT ACA 
Gly Gln Glu Val Thr Asp Ala Phe Val Ala Phe His Pro Ala Ser Thr 
55 60 65 

TGG AAG AAT CTT GAT AAG TTT TTC ACT GGG TAT TAT CTT AAA GAT TAC 
Trp Lys Asn Leu Asp Lys Phe Phe Thr Gly Tyr Tyr Leu Lys Asp Tyr 
70 75 80 

TCT GTT TCT GAG GTT TCT AAA GAT TAT AGG AAG CTT GTG TTT GAG TTT 
Ser Val Ser Glu Val Ser Lys Asp Tyr Arg Lys Leu Val Phe Glu Phe 
85 90 

TCT AAA ATG GGT TTG TAT GAC AAA AAA GGT CAT ATT ATG TTT GCA ACT 
Ser Lys Met Gly Leu Tyr Asp Lys Lys Gly His Ile Met Phe Ala Thr 
105 110 115 

TTG TGC TTT ATA GCA ATG CTG TTT GCT ATG AGT GTT TAT GGG GTT TTG 
Leu Cys Phe Ile Ala Met Leu Phe Ala Met Ser Val Tyr Gly Val Leu 
120 125 

TTT TGT GAG GGT GTT TTG GTA CAT TTG TTT TCT GGG TGT TTG ATG GGG 
Phe Cys Glu Gly Val Leu Val His Leu Phe Ser Gly Cys Leu Met Gly 

TTT CTT TGG ATT CAG AGT GGT TGG ATT GGA CAT GAT GCT GGG CAT TAT 
Phe Leu Trp Ile Gln Ser Gly Trp Ile Gly His Asp Ala Gly His Tyr 
150 155 160 

ATG GTA GTG TCT GAT TCA AGG CTT AAT AAG TTT ATG GGT ATT TTT GCT 
Met Val Val Ser Asp Ser Arg Leu Asn Lys Phe Met Gly Ile Phe Ala 

GCA AAT TGT CTT TCA GGA ATA AGT ATT GGT TGG TGG AAA TGG AAC CAT 
Ala Asn Cys Leu Ser Gly Ile Ser Ile Gly Trp Trp Lys Trp Asn His 

185 

AAT GCA CAT CAC ATT GCC TGT AAT AGC CTT GAA TAT GAG CCT GAT TTA 678 
Asn Ala His His Ile Ala Cys Asn Ser Leu Glu Tyr Asp Pro Asp Leu 
200 205 210 

CAA TAT ATA CCA TTC CTT GTT GTG TCT TCC AAG TTT TTT GGT TCA CTC 726 
Gln Tyr Ile Pro Phe Leu Val Val Ser Ser Lys Phe Phe Gly Ser Leu 
215 220 225 

ACC TCT CAT TTC TAT GAG AAA AGG TTG ACT TTT GAC TCT TTA TCA AGA 774 
Thr Ser His Phe Tyr Glu Lys Arg Leu Thr Phe Asp Ser Leu Ser Arg 
230 235 240 

TTC TTT GTA AGT TAT CAA CAT TGG ACA TTT TAC CCT ATT ATG TGT GCT 822 
Phe Phe Val Ser Tyr Gln His Trp Thr Phe Tyr Pro Ile Met Cys Ala 
245 250 255 260 

GCT AGG CTC AAT ATG TAT GTA CAA TCT CTC ATA ATG TTG TTG ACC AAG 870 
Ala Arg Leu Asn Met Tyr Val Gln Ser Leu Ile Met Leu Leu Thr Lys 
265 270 275 

AGA AAT GTG TCC TAT CGA GCT CAG GAA CTC TTG GGA TGC CTA GTG TTC 918 
Arg Asn Val Ser Tyr Arg Ala Gln Glu Leu Leu Gly Cys Leu Val Phe 
280 285 290 

TCG ATT TGG TAC CCG TTG CTT GTT TCT TGT TTG CCT AAT TGG GGT GAA 966 
Ser Ile Trp Tyr Pro Leu Leu Val Ser Cys Leu Pro Asn Trp Gly Glu 
295 300 305 

AGA ATT ATG TTT GTT ATT GCA AGT TTA TCA GTG ACT GGA ATG CAA CAA 1014 
Arg Ile Met Phe Val Ile Ala Ser Leu Ser Val Thr Gly Met Gln Gln 
310 315 320 

GTT GAG TTC TCC TTG AAC CAC TTC TCT TCA AGT GTT TAT GTT GGA AAG 1062 
Val Gln Phe Ser Leu Asn His Phe Ser Ser Ser Val Tyr Val Gly Lys 
325 330 335 340 

CCT AAA GGG AAT AAT TGG TTT GAG AAA CAA ACG GAT GGG ACA CTT GAC 1110 
Pro Lys Gly Asn Asn Trp Phe Glu Lys Gln Thr Asp Gly Thr Leu Asp 
345 350 355 

ATT TCT TGT CCT CCT TGG ATG GAT TGG TTT CAT GGT GGA TTG CAA TTC 1158 
Ile Ser Cys Pro Pro Trp Met Asp Trp Phe His Gly Gly Leu Gln Phe 
360 365 370 

CAA ATT GAG CAT CAT TTG TTT CCC AAG ATG CCT AGA TGC AAC CTT AGG 1206 
Gln Ile Glu His His Leu Phe Pro Lys Met Pro Arg Cys Asn Leu Arg 
375 380 385 

AAA ATC TCG CCC TAC GTG ATC GAG TTA TGC AAG AAA CAT AAT TTG CCT 1254 
Lys Ile Ser Pro Tyr Val Ile Glu Leu Cys Lys Lys His Asn Leu Pro 
390 395 400 

TAC AAT TAT GCA TCT TTC TCC AAG GCC AAT GAA ATG ACA CTC AGA ACA 1302 
Tyr Asn Tyr Ala Ser Phe Ser Lys Ala Asn Glu Met Thr Leu Arg Thr 
405 410 415 

T-vn ACG AAC ACA GCA TTG CAG GCT AGG GAT ATA ACC AAG CCG CTC CCG 1350 
Leu Arg Asn Thr Ala Leu Gln Ala Arg Asp Ile Thr Lys Pro Leu Pro 

AAG AAT TTG GTA TGG GAA GCT CTT CAC ACT CAT GGT TAAAATTACCC 1397 
Lys Asn Leu Val Trp Glu Ala Leu His Thr His Gly 
440 445 

TTAGTTCATG TAATAATTTG AGATTATGTA TCTCCTATGT TTGTGTCTTG TCTTGGTTCT 1457 

ACTTGTTGGA GTCATTGCAA CTTGTCTTTT ATGGTTTATT AGATGTTTTT TAATATATTT 1517 

TAGAGGTTTT GCTTTCATCT CCATTATTGA TGAATAAGGA GTTGCATATT GTCAATTGTT 1577 

GTGCTCAATA TCTGATATTT TGGAATGTAC TTTGTACCAC GTGGTTTTCA GTTGAAGCTC 1637 

ATGTGTACTT CTATAGACTT TGTTTAAATG GTTATGTCAT GTTATTT 1684 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 135 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

Met Ala Ala Gln Ile Lys Lys Tyr Ile Thr Ser Asp Glu Leu Lys Asn 
1 5 10 15 

His Asp Lys Pro Gly Asp Leu Trp Ile Ser Ile Gln Gly Lys Ala Tyr 
20 25 30 

Asp Val Ser Asp Trp Val Lys Asp His Pro Gly Gly Ser Phe Pro Leu 

Lys Ser Leu Ala Gly Gln Glu Val Thr Asp Ala Phe Val Ala Phe His 

Pro Ala Ser Thr Trp Lys Asn Leu Asp Lys Phe Phe Thr Gly Tyr Tyr 
65 70 75 80 

Leu Lys Asp Tyr Ser Val Ser Glu Val Ser Lys Asp Tyr Arg Lys Leu 
85 90 

Val Phe Glu Phe Ser Lys Met Gly Leu Tyr Asp Lys Lys Gly His Ile 
100 105 110 

Met Phe Ala Thr Leu Cys Phe Ile Ala Met Leu Phe Ala Met Ser Val 

Tyr Gly Val Leu Phe Cys Glu 
130 135 
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(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 834 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 31..603 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
TTAGCCTTTA CTCTATAGTT TTAGATAGAC ATG GCG AAT GTG GAT CGT GAT CGG 

140 

- III ?s ^ i iss ^5 
sts S i s?j ^ sf. 1 
S ^2 i'^'^ s sts ?re St 

St fo If. i St 5S £st |v ^ 
St s?s S 1 ^ IS? ?Tt 11 ss 

210 

TeS si? ill SI? S2 STu g tre 

225 230 

_ ^f,™ TTG TTT GGG TTG ACG GGT CTG AGO 

SI? ?S? 1?^ ;?I SI l?J TeS Ihe Gly Thr CXy Se. 
240 245 " 

- 51? IS? S SI? S ttS ?J5 S Ii S?? ?S S5 SI. 



54 

102 

150 

198 

246 

294 

342 

390 

438 
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CCA GAG CAA TTG GAC TAC GCT AAA CGG CGT ATG GCT GAT GCG GTA GGC 486 
Pro Glu Gln Leu Asp Tyr Ala Lys Arg Arg Met Ala Asp Ala Val Gly 
275 280 285 

TAT GCT GGT ATG AAG GGA AAA GAG ATG GGT GAG TAT GTG CAA GAT AAG 534 
Tyr Ala Gly Met Lys Gly Lys Glu Met Gly Gln Tyr Val Gln Asp Lys 
290 295 300 

GCT CAT GAG GCT CGT GAG ACT GAG TTC ATG ACT GAG ACC CAT GAG CCG 582 
Ala His Glu Ala Arg Glu Thr Glu Phe Met Thr Glu Thr His Glu Pro 
305 310 315 

GGT AAG GCC AGG AGA GGC TCA TAAGCTAATA TAAATTGCGG GAGTCAGTTG 633 
Gly Lys Ala Arg Arg Gly Ser 
320 325 

GAAACGCGAT AAATGTAGTT TTACTTTTAT GTCCCAGTTT CTTTCCTCTT TTAAGAATAT 693 
CTTTGTCTAT ATATGTGTTC GTTCGTTTTG TCTTGTCCAA ATAAAAATCC TTGTTAGTGA 753 
AATAAGAAAT GAAATAAATA TGTTTTCTTT TTTGAGATAA CCAGAAATCT CATACTATTT 813 

TCTAAAAAAA AAAAAAAAAA A 834 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Ala Asn Val Asp Arg Asp Arg Arg Val His Val Asp Arg Thr Asp 
1 5 10 15 

Lys Arg Val His Gln Pro Asn Tyr Glu Asp Asp Val Gly Phe Gly Gly 
20 25 30 

Tyr Gly Gly Tyr Gly Ala Gly Ser Asp Tyr Lys Ser Arg Gly Pro Ser 
35 40 45 

Thr Asn Gln Ile Leu Ala Leu Ile Ala Gly Val Pro Ile Gly Gly Thr 
50 55 60 

Leu Leu Thr Leu Ala Gly Leu Thr Leu Ala Gly Ser Val Ile Gly Leu 
65 70 75 80 

Leu Val Ser Ile Pro Leu Phe Leu Leu Phe Ser Pro Val Ile Val Pro 
85 90 95 

Ala Ala Leu Thr Ile Gly Leu Ala Val Thr Gly Ile Leu Ala Ser Gly 

Leu Phe Gly Leu Thr Gly Leu Ser Ser Val Ser Trp Val Leu Asn Tyr 
115 120 

Leu Arg Gly Thr Ser Asp Thr Val Pro Glu Gln Leu Asp Tyr Ala Lys 
130 135 140 

Arg Arg Met Ala Asp Ala Val Gly Tyr Ala Gly Met Lys Gly Lys Glu 
145 150 155 

Met Gly Gln Tyr Val Gln Asp Lys Ala His Glu Ala Arg Glu Thr Glu 
165 170 175 

Phe Met Thr Glu Thr His Glu Pro Gly Lys Ala Arg Arg Gly Ser 
180 185 190 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Met Ala Asn Val Asp Arg Asp Arg Arg Val His Val Asp Arg Thr Asp 
1 5 10 15 

Lys Arg Val His Gln Pro Asn Tyr Glu Asp Asp Val Gly Phe Gly Gly 
20 25 30 

Thr Gly Gly Thr Gly Ala Gly Ser Asp Tyr Lys Ser Arg Gly Pro Ser 
35 40 45 

Thr Asn Gln Ile Leu Ala Leu Ile Ala Gly Val Pro Ile Gly Gly Thr 
50 55 60 

Leu Ile Thr Leu Ala Gly Leu Thr Leu Ala Gly Ser Val Ile Gly Leu 
65 70 75 80 

Leu Val Ser Ile Pro Leu Phe Leu Ile Phe Ser Pro Val Ile Val Pro 
85 90 95 

Ala Ala Leu Thr Ile Gly Leu Ala Val Thr Gly Ile Leu Ala Ser Gly 
100 105 110 

Leu Phe Gly Leu Thr Gly Leu Ser Ser Val Ser Trp Val Leu Asn Tyr 
115 120 125 

PCT/US98/07179 

WO 98/45461 

Leu Arg Gly Thr Ser Asp Thr Val Pro Glu Gln Leu Asp Tyr Ala Lys 
130 

Arg Arg Met Ala Asp Ala Val Gly Tyr Ala Gly Met Lys Gly Lys Glu 
145 150 

Met Gly Gln Tyr Val Gln Asp Lys Ala His Glu Ala Arg Glu Thr Glu 

Phe Met Thr Glu Thr His Glu Pro Gly Lys Ala Arg Arg Gly Ser 
180 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Ala Asn Val Asp Arg Asp Arg Arg Val His Val Asp Arg Thr Asp 
1 5 

Lys Arg Val His Gln Pro Asn Tyr Glu Asp Asp Val Gly Phe Gly Gly 
20 

Thr Gly Gly Thr Gly Ala Gly Ser Asp Tyr Lys Ser Arg Gly Pro Ser 
35 40 

Thr Asn Gln Ile Leu Ala Leu Ile Ala Gly Val Pro Ile Gly Gly Thr 
50 55 

Leu Ile Thr Leu Ala Gly Leu Thr Leu Ala Gly Ser Val Ile Gly Leu 
65 70 

Leu Val Ser Ile Pro Leu Phe Leu Ile Phe Ser Pro Val Ile Val Pro 
85 

Ala Ala Leu Thr Ile Gly Leu Ala Val Thr Gly Ile Leu Ala Ser Gly 
100 

Leu Phe Gly Leu Thr Gly Leu Ser Ser Val Ser Trp Val Leu Asn Tyr 

Leu Arg Gly Thr Ser Asp Thr Val Pro Glu Gln Leu Asp Tyr Ala Lys 

Arg Arg Met Ala Asp Ala Val Gly Tyr Ala Gly Met Lys Gly Lys Glu 
145 150 

Met Gly Gln Tyr Val Gln Asp Lys Ala His Glu Ala Arg Glu Thr Glu 

Phe Met Thr Glu Thr His Glu Pro Gly Lys Ala Arg Arg Gly Pro 
180 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 78 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Phe Gly Leu Thr Gly Leu Ser Ser Val Ser Trp Val Leu Gln Leu Pro 

Pro Trp Ala Ser Asp Thr Val Pro Glu Gln Val Asp Tyr Ala Lys Arg 
20 25 30 

Arg Met Ala Asp Ala Val Gly Tyr Ala Gly Met Lys Gly Lys Glu Met 
35 

Gly Gln Tyr Val Gln Asp Lys Ala His Glu Ala Arg Glu Thr Glu Phe 
50 55 60 

Met Thr Glu Thr His Glu Pro Gly Lys Ala Arg Arg Gly Ser 
65 70 75 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 173 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Met Ala Asp Thr Ala Arg Gly Thr His His Asp Ile Ile Gly Arg Asp 
1 5 10 

Gln Tyr Pro Met Met Gly Arg Asp Arg Asp Gln Tyr Gln Met Ser Gly 
20 25 30 

Arg Gly Ser Asp Tyr Ser Lys Ser Arg Gln Ile Ala Lys Ala Ala Thr 
35 40 

Ala Val Thr Ala Gly Gly Ser Leu Leu Val Leu Ser Ser Leu Thr Leu 
50 55 60 

Val Gly Thr Val Leu Ala Leu Thr Val Ala Thr Pro Leu Leu Val Leu 
65 70 75 80 

Phe Ser Pro Ile Leu Val Pro Ala Leu Ile Thr Val Ala Leu Leu Ile 
85 90 

Thr Gly Phe Leu Ser Ser Gly Gly Phe Gly Ile Ala Ala Ile Thr Val 
100 105 110 

Phe Ser Trp Ile Tyr Lys Tyr Ala Thr Gly Glu His Pro Gln Gly Ser 
115 120 125 

Asp Lys Leu Asp Ser Ala Arg Met Lys Leu Gly Ser Lys Ala Gln Asp 
130 135 140 

Leu Lys Asp Arg Ala Gln Tyr Tyr Gly Gln Gln His Thr Gly Gly Glu 
145 150 155 160 

His Asp Arg Asp Arg Thr Arg Gly Gly Gln His Thr Thr 
165 170 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 141 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Met Ala Asp Gln Thr Arg Thr His His Glu Met Ile Ser Arg Asp Ser 
1 5 10 15 

Thr Gln Glu Ala His Pro Lys Ala Arg Gln Trp Val Lys Ala Ala Thr 
20 25 30 

Ala Val Thr Ala Gly Gly Ser Leu Leu Val Leu Ser Gln Leu Thr Leu 
35 40 45 

Ala Gly Thr Val Ile Ala Leu Thr Val Ala Thr Pro Leu Leu Val Ile 
50 55 60 

Phe Ser Pro Val Leu Val Pro Ala Val Val Thr Val Ala Leu Ile Ile 
65 70 75 80 

Thr Gly Phe Leu Ala Ser Gly Gly Phe Gly Ile Ala Ala Ile Thr Ala 

Phe Ser Trp Leu Tyr Arg His Trp Thr Gly Ser Gly Ser Asp Lys Ile 
100 105 110 

Glu Trp Ala Arg Met Lys Val Gly Ser Arg Val Gln Asp Thr Lys Tyr 
115 120 125 

Gly Gln His Trp Ile Gly Val Gln His Gln Gln Val Ser 
130 135 140 

(2) INFORMATION FOR SEQ ID NO: 10: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 199 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Met Ala Asp Thr His Arg Val Asp Arg Thr Asp Arg His Phe Gln Phe 
1 5 10 15 

Gln Ser Pro Tyr Glu Gly Gly Arg Gly Gln Gly Gln Tyr Glu Gly Asp 
20 25 30 

Arg Gly Tyr Gly Gly Gly Gly Tyr Lys Ser Met Met Pro Glu Ser Gly 
35 40 45 

Pro Ser Ser Thr Gln Val Leu Ser Leu Leu Ile Gly Val Pro Val Val 
50 55 60 

Gly Ser Leu Ile Ala Leu Ala Gly Leu Leu Leu Ala Gly Ser Val Ile 
65 70 75 80 

Gly Leu Met Val Ala Leu Pro Leu Phe Leu Ile Phe Ser Pro Val Ile 
85 90 95 

Val Pro Ala Gly Leu Thr Ile Gly Leu Ala Met Thr Gly Phe Leu Ala 
100 105 110 

Ser Gly Met Phe Gly Leu Thr Gly Leu Ser Ser Ile Ser Trp Val Met 
115 120 125 

Asn Tyr Leu Arg Gly Thr Ala Arg Thr Val Pro Glu Gln Leu Glu Tyr 
130 135 140 

Ala Lys Arg Arg Met Ala Asp Ala Val Gly Tyr Ala Gly Gln Lys Gly 
145 150 155 160 

Lys Glu Met Gly Gln His Val Gln Asn Lys Ala Gln Asp Val Lys Gln 
165 170 175 

Tyr Asp Ile Ser Lys Pro His Asp Thr Thr Thr Lys Gly His Glu Thr 
180 185 190 

Gln Gly Gly Thr Thr Ala Ala 
195 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 199 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Met Ala Asp Thr His Arg Val Asp Arg Thr Asp Arg His Phe Gln Phe 
1 5 10 15 

Gln Ser Pro Tyr Glu Gly Gly Arg Gly Gln Gly Gln Tyr Glu Gly Asp 
20 25 30 

Arg Gly Tyr Gly Gly Gly Gly Tyr Lys Ser Met Met Pro Glu Ser Gly 
35 40 45 

Pro Ser Ser Thr Gln Val Leu Ser Leu Leu Ile Gly Val Pro Val Val 
50 55 60 

Gly Ser Leu Ile Ala Leu Ala Gly Leu Leu Ile Ala Gly Ser Val Ile 
65 70 75 80 

Gly Leu Met Val Ala Leu Pro Leu Phe Leu Ile Phe Ser Pro Val Ile 
85 90 95 

Val Pro Ala Ala Leu Thr Ile Gly Leu Ala Met Thr Gly Phe Leu Ala 
100 105 110 

Ser Gly Met Phe Gly Leu Thr Gly Leu Ser Ser Ile Ser Trp Val Met 
115 120 125 

Asn Tyr Leu Arg Gly Thr Arg Arg Thr Val Pro Glu Gln Leu Glu Tyr 
130 135 140 

Ala Lys Arg Arg Met Ala Asp Ala Val Gly Tyr Ala Gly Gln Lys Gly 
145 150 155 160 

Lys Glu Met Gly Gln His Val Gln Asn Lys Ala Gln Asp Val Lys Gln 
165 170 175 

Tyr Asp Ile Ser Lys Pro His Asp Thr Thr Thr Lys Gly His Glu Thr 
180 

Gln Gly Arg Thr Thr Ala Ala 
195 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1267 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

GAGCTCGATC ACACAAAGAA AACGTCAAAT GGATCATACT GGGCCCATTT TGCAGACCAA 60 

GAGAAAGTGA GAGAGAGTTG TCCTCTCGTT ATCAAGTAAC AGTAGACCAC CACTAAACCG 120 

CCAATAGCTT ATAATCAAAA TAGAAAGGTC TAATAACAGA AACAAATGAA AAAGCCTTGT 180 

TCCATGGACT GCCTACCCGA ATTGATTGAT TCGACTAGTT TTTCTTCTTC TTTGATTAAG 240 

ACCTCCGTAA GAAAAATGGT ACTACTAAAG CCACTCGCTA CCAAAACTAA ACCATTCCAG 300 

ACTGTAACTG GACCAATATT TCTAAACTGT AACCAGATCT CAAACATATA AACTAATTAA 360 

GAACTATAAC CATTAACCGT AAAAATAAAT TTACTACAGT AAAAAATTAT ACTAATTTCA 420 

GCTATGATGG AATTTCAGCT CTTAAGAGTT GTGGAAATCA AGTAAACCTA AAATCCTAAT 480 

AATATTCTTC ATCCTTATTT TTGTTTCACA TGCATGCTGT CCAATCTGTT ATTAGCATTT 540 

GAAAGCCTAA AATTCTATAT ACAGTACAAT AAATCTAATT AATTTTCATT ACTAATAAAA 600 

TGCTTCATAT ATACTCTTGT ATTTATAAAT CATCCGTTAT CGTTACTATA CCTTTATACA 660 

TCATCCTACA TTCATACCTA AGCTAGCAAA GCAAACTACT AAAAGGGTCG TCAACGCAAG 720 

TTATTTGCTA GTTGGTGCAT ACTACACACG GCTACGGCAA CATTAAGTAA CACATTAAGA 780 

GGTGTTTTCT TAATGTAGTA TGGTAATTAT ATTTATTTCA AAACTTGGAT TAGATATAAA 840 

GGTACAGGTA GATGAAAAAT ATTTGGTTAG CGGGTTGAGA TTAAGCGGAT ATAGGAGGCA 900 

TATATACAGC TGTGAGAAGA AGAGGGATAA ATACAAAAAG GGAAGGATGT TTTTGCCGAC 960 

AGAGAAAGGT AGATTAAGTA GGCATCGAGA GGAGAGCAAT TGTAAAATGG ATGATTTGTT 1020 

TGGTTTTGTA CGGTGGAGAG AAGAACGAAA AGATGATCAG GTAAAAAATG AAACTTGGAA 1080 

ATCATGCAAA GCCACACCTC TCCCTTCAAC ACAGTCTTAC GTGTCGTCTT CTCTTCACTC 1140 

CATATCTCCT TTTTATTACC AAGAAATATA TGTCAATCCC ATTTATATGT ACGTTCTCTT 1200 

AGACTTATCT CTATATACCC CCTTTTAATT TGTGTGCTCT TAGCCTTTAC TCTATAGTTT 1260 

TAGATAG 1267 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
GTAATACGAC TCACTATAGG GC 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
GGGGATCCTA TACTAAAACT ATAGAGTAAA GG 
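
As a consistency check on the listing as printed, the sketch below tests whether the 3' portion of the SEQ ID NO: 14 oligonucleotide is the reverse complement of a stretch near the 3' end of SEQ ID NO: 12. The 20-base window, the variable names, and the use of only the last 67 bases of SEQ ID NO: 12 are assumptions made for illustration; the listing itself does not state how the oligonucleotide was designed.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


# Bases 1201-1267 of SEQ ID NO: 12, copied from the listing.
seq12_tail = (
    "AGACTTATCT" "CTATATACCC" "CCTTTTAATT" "TGTGTGCTCT"
    "TAGCCTTTAC" "TCTATAGTTT" "TAGATAG"
)

# SEQ ID NO: 14 (32 bases), copied from the listing.
primer14 = "GGGGATCCTATACTAAAACTATAGAGTAAAGG"

# Assumed window: the 20 bases at the 3' end of the oligonucleotide.
window = 20
probe = reverse_complement(primer14[-window:])

# Zero-based offset of the match within seq12_tail, or -1 if absent.
pos = seq12_tail.find(probe)
if pos >= 0:
    start = 1201 + pos  # convert to SEQ ID NO: 12 coordinates
    print(f"3' {window}-mer matches SEQ ID NO: 12 at {start}-{start + window - 1}")
else:
    print("no exact match in the printed tail")
```

A shorter or longer window can be tried by changing `window`; a mismatch may reflect either a deliberate non-template 5' extension or a transcription error in the printed listing.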



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Trp Ile Gly His Asp Ala Gly His 
1 5 
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INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Asn Val Gly His Asp Ala Asn His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Val Leu Gly His Asp Cys Gly His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Val Ile Ala His Glu Cys Gly His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
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(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Val Ile Gly His Asp Cys Ala His 
1 5 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20: 

Val Val Gly His Asp Cys Gly His 
1 5 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

His Asn Ala His His 
1 5 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

His Asn Tyr Leu His His 
1 5 
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INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

His Arg Thr His His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

His Arg Arg His His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

His Asp Arg His His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
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(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

His Asp Gln His His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

His Asp His His His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: 

His Asn His His His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

Phe Gln Ile Glu His His 
1 5 
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(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

His Gln Val Thr His His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

His Val Ile His His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

His Val Ala His His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

His Ile Pro His His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

His Val Pro His His 
1 5 

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1941 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 



GAGCTCGATC ACACAAAGAA AACGTCAAAT GGATCATACT GGGCGCATTT TGCAGACCAA 60 

GAGAAAGTGA GAGAGAGTTG TCCTCTCGTT ATCAAGTAAC AGTAGACCAC CACTAAACCG 120 

CCAATAGCTT ATAATCAAAA TAGAAAGGTC TAATAACAGA AACAAATGAA AAAGCCTTGT 180 

TCCATGGACT GCCTACCCGA ATTGATTGAT TCGACTAGTT TTTCTTCTTC TTTGATTAAG 240 

ACCTCCGTAA GAAAAATGGT ACTACTAAAG CCACTCGCTA CCAAAACTAA ACCATTCCAG 300 

ACTGTAACTG GACCAATATT TCTAAACTGT AACCAGATCT CAAACATATA AACTAATTAA 360 

GAACTATAAC CATTAACCGT AAAAATAAAT TTAGTACAGT AAAAAATTAT ACTAATTTCA 420 

GCTATGATGG AATTTCAGCT CTTAAGAGTT GTGGAAATCA AGTAAACCTA AAATCCTAAT 480 

AATATTCTTC ATCCTTATTT TTGTTTCACA TGCATGCTGT CCAATCTGTT ATTAGCATTT 540 

GAAAGCCTAA AATTCTATAT ACAGTACAAT AAATCTAATT AATTTTCATT ACTAATAAAA 600 

TGCTTCATAT ATACTCTTGT ATTTATAAAT CATCCGTTAT CGTTACTATA CCTTTATACA 660 

TCATCCTACA TTCATACCTA AGCTAGCAAA GCAAACTACT AAAAGGGTCG TCAACGCAAG 720 

TTATTTGCTA GTTGGTGCAT ACTACACACG GCTACGGCAA CATTAAGTAA CACATTAAGA 780 

GGTGTTTTCT TAATGTAGTA TGGTAATTAT ATTTATTTCA AAACTTGGAT TAGATATAAA 840 

GGTACAGGTA GATGAAAAAT ATTTGGTTAG CGGGTTGAGA TTAAGCGGAT ATAGGAGGCA 900 

TATATACAGC TGTGAGAAGA AGAGGGATAA ATACAAAAAG GGAAGGATGT TTTTGCCGAC 960 

AGAGAAAGGT AGATTAAGTA GGCATCGAGA GGAGAGCAAT TGTAAAATGG ATGATTTGTT 1020 

TGGTTTTGTA CGGTGGAGAG AAGAACGAAA AGATGATCAG GTAAAAAATG AAACTTGGAA 1080 

ATCATGCAAA GCCACACCTC TCCCTTCAAC ACAGTCTTAC GTGTCGTCTT CTCTTCACTC 1140 

CATATCTCCT TTTTATTACC AAGAAATATA TGTCAATCCC ATTTATATGT ACGTTCTCTT 1200 

AGACTTATCT CTATATACCC CCTTTTAATT TGTGTGCTCT TAGCCTTTAC TCTATAGTTT 1260 

TAGATAGACA TGGCGAATGT GGATCGTGAT CGGCGTGTGC ATGTAGACCG TACTGACAAA 1320 

CGTGTTCATC AGCCAAACTA CGAAGATGAT GTCGGTTTTG GTGGCTATGG CGGTTATGGT 1380 

GCTGGTTCTG ATTATAAGAG TCGCGGCCCC TCCACTAACC AAGTATTTTT GTGGTCTCTT 1440 

TAGTTTTTCT TGTGTTTTCC TATGATCACG CTCTCCAAAC TATTTGAAGA TTTTCTGTAA 1500 

ATTCATTTTA AACAGAAAGA TAAATAAAAT AGTGAAGAAC CATAGGAATC GTACGTTACG 1560 

TTAATTATTT CCTTTTAGTT CTTAAGTCCT AATTAGGATT CCTTTAAAAG TTGCAACAAT 1620 

CTAATTGTTC ACAAAATGAG TAAAGTTTGA AACAGATTTT TATACACCAC TTGCATATGT 1680 

TTATCATGGT GATGCATGCT TGTTAGATAA ACTCGATATA ATCAATACAT GCAGATCTTG 1740 

GCACTTATAG CAGGAGTCCA TTGGTGGCAC ACTGCTAACC CTAGCTGGAC TCACTCTAGC 1800 

CGGTTCGGTG ATCGGCTTGC TAGTCTCCAT ACCCCTCTTC CTCCTCTTCA GTCCGGTGAT 1860 

AGTCCCGGCG GCTCTCACTA TTGGGCTTGC TGTGACGGGA ATCTTGGCTT CTGGTTTGTT 1920 

TGGGTTGACG GGTCTGAGCT C 1941 
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What is Claimed is: 

1. An isolated nucleic acid encoding an 
oleosin 5' regulatory region which directs seed- 
specific expression selected from the group 
consisting of the nucleotide sequence set forth in SEQ 
ID NO: 12, the nucleotide sequence set forth in SEQ ID 
NO: 12 having an insertion, deletion, or substitution 
of one or more nucleotides, or a contiguous fragment 
of the nucleotide sequence set forth in SEQ ID NO: 12. 

2. An expression cassette which comprises 
the oleosin 5' regulatory region of Claim 1 operably 
linked to at least one of a nucleic acid encoding a 
heterologous gene or a nucleic acid encoding a 
sequence complementary to a native plant gene. 

3. The expression cassette of Claim 2 
wherein the heterologous gene is at least one of a 
fatty acid synthesis gene or a lipid metabolism gene. 

4. The expression cassette of Claim 3 
wherein the heterologous gene is selected from the 
group consisting of an acetyl-CoA carboxylase gene, a 
ketoacyl synthase gene, a malonyl transacylase gene, a 
lipid desaturase gene, an acyl carrier protein (ACP) 
gene, a thioesterase gene, an acetyl transacylase 
gene, or an elongase gene. 

5. The expression cassette of Claim 4 
wherein the lipid desaturase gene is selected from the 
group consisting of a Δ6-desaturase gene, a Δ12- 
desaturase gene, and a Δ15-desaturase gene. 

6. An expression vector which comprises the 
expression cassette of any one of Claims 2-5. 
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7. A cell comprising the expression 
cassette of any one of Claims 2-5. 

8. A cell comprising the expression vector 
of Claim 6. 

9. The cell of Claim 7 wherein said cell is 
a bacterial cell or a plant cell. 

10. The cell of Claim 8 wherein said cell 
is a bacterial cell or a plant cell. 

11. A transgenic plant comprising the 
expression cassette of any one of Claims 2-5. 

12. A transgenic plant comprising the 
expression vector of Claim 6. 

13. A plant which has been regenerated from 
the plant cell of Claim 9. 

14. A plant which has been regenerated from 
the plant cell of Claim 10. 

15. The plant of Claim 12 or 13 wherein 
said plant is at least one of a sunflower, soybean, 
maize, cotton, tobacco, peanut, oil seed rape or 
Arabidopsis plant. 

16. Progeny of the plant of Claim 11 or 12. 

17. Seed from the plant of Claim 11 or 12. 

18. A method of producing a plant with 
increased levels of a product of a fatty acid 
synthesis gene or a lipid metabolism gene which 
comprises: 

(a) transforming a plant cell with an 
expression vector comprising the isolated nucleic acid 
of Claim 1 operably linked to at least one of an 
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isolated nucleic acid coding for a fatty acid 
synthesis gene or a lipid metabolism gene; and 

(b) regenerating a plant with increased 
levels of the product of said fatty acid synthesis or 
said lipid metabolism gene from said plant cell. 

19. A method of producing a plant with 
increased levels of gamma linolenic acid (GLA) content 
which comprises: 

(a) transforming a plant cell with an 
expression vector comprising the isolated nucleic acid 
of Claim 1 operably linked to a Δ6-desaturase gene; 
and 

(b) regenerating a plant with increased 
levels of GLA from said plant cell. 

20. The method of Claim 19 wherein said Δ6- 
desaturase gene is at least one of a cyanobacterial 
Δ6-desaturase gene or a borage Δ6-desaturase gene. 

21. The method of any one of Claims 18-20 
wherein said plant is a sunflower, soybean, maize, 
tobacco, cotton, peanut, oil seed rape or Arabidopsis 
plant. 

22. The method of Claim 18 wherein said 
fatty acid synthesis gene or said lipid metabolism 
gene is at least one of a lipid desaturase gene, an acyl 
carrier protein (ACP) gene, a thioesterase gene, an 
elongase gene, an acetyl transacylase gene, an acetyl- 
CoA carboxylase gene, a ketoacyl synthase gene, or a 
malonyl transacylase gene. 

23. A method of inducing production of at 
least one of gamma linolenic acid (GLA) or 
octadecatetraenoic acid (OTA) in a plant deficient or 
lacking in GLA which comprises transforming said plant 
with an expression vector comprising the isolated 
nucleic acid of Claim 1 operably linked to a Δ6-desaturase 
gene and regenerating a plant with 
increased levels of at least one of GLA or OTA. 

24. A method of decreasing production of a 
fatty acid synthesis or lipid metabolism gene in a 
plant which comprises: 

(a) transforming a plant cell with an 
expression vector comprising the isolated nucleic acid 
of Claim 1 operably linked to a nucleic acid sequence 
complementary to a fatty acid synthesis or lipid 
metabolism gene; and 

(b) regenerating a plant with decreased 
production of said fatty acid synthesis or said lipid 
metabolism gene. 

25. A method of cosuppressing a native 
fatty acid synthesis or lipid metabolism gene in a 
plant which comprises: 

(a) transforming a cell of the plant with an 
expression vector comprising the isolated nucleic acid 
of Claim 1 operably linked to a nucleic acid sequence 
encoding a fatty acid synthesis or lipid metabolism 
gene native to the plant; and 

(b) regenerating a plant with decreased 
production of said fatty acid synthesis or said lipid 
metabolism gene. 
Mammalian cDNA fragments putatively encoding 
amino acid sequences characteristic of the fatty acid 
desaturase were obtained using expressed sequence 
tag (EST) sequence information. These fragments 
were subsequently used to screen a rat liver cDNA 
library, yielding a 1573-bp clone. Expression of a DNA 
fragment containing either of two possible open reading 
frames (nucleotide numbers 97-1431 and 148-1431) 
of the isolated clone in yeast led to the accumulation of 
γ-linolenic acid in the presence of exogenous linoleic 
acid. In this system, the addition of α-linolenic acid 
also resulted in the accumulation of its Δ-6 desaturated 
product, whereas dihomo-γ-linolenic acid failed 
to be a substrate. These results indicate that the protein 
encoded by the rat cDNA is Δ-6 fatty acid desaturase, 
and that the first 17 amino acids corresponding to the 
coding region 97-147 of the clone are not required to 
function in yeast. © 1999 Academic Press 



Δ-6 Desaturase catalyzes the conversion of linoleic 
acid (LA, C18:2Δ-9, 12) to γ-linolenic acid (GLA, 
C18:3Δ-6, 9, 12) by inserting a double bond between carbons 
6 and 7, in conjunction with a cytochrome b5-mediated 
electron transfer system in mammals. Since GLA and 
its elongation product, dihomo-γ-linolenic acid (DGLA, 
C20:3Δ-8, 11, 14), are barely detectable in mammalian 
cells, it is generally accepted that the Δ-6 desaturation 
step is rate-limiting (1). In this context, the activity of 
the Δ-6 desaturase is considered to directly affect the 
cellular content of arachidonic acid (AA, C20:4Δ-5, 8, 
11, 14), which is a Δ-5 desaturated product of DGLA. It 
is feasible to extend this view of the n-6 pathway to 
the other pathway (n-3), where α-linolenic acid (ALA, 
C18:3Δ-9, 12, 15) is converted to eicosapentaenoic acid 
(EPA, C20:5Δ-5, 8, 11, 14, 17) through Δ-6 desaturation. 
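
The shorthand used above (chain length, number of double bonds, Δ positions counted from the carboxyl carbon) can be tracked with a small bookkeeping sketch. The Python below is illustrative only: `FattyAcid`, `desaturate`, and `elongate` are hypothetical names introduced here, and the model assumes that elongation adds two carbons at the carboxyl end, so every existing Δ position shifts by two.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FattyAcid:
    """Hypothetical record: chain length plus Δ positions of the double bonds."""
    carbons: int
    double_bonds: Tuple[int, ...]  # counted from the carboxyl carbon (C1)

    def shorthand(self) -> str:
        # Saturated chains carry no Δ suffix, e.g. C16:0.
        if not self.double_bonds:
            return f"C{self.carbons}:0"
        positions = ", ".join(str(p) for p in self.double_bonds)
        return f"C{self.carbons}:{len(self.double_bonds)}Δ-{positions}"


def desaturate(fa: FattyAcid, position: int) -> FattyAcid:
    """Insert a double bond between carbons `position` and `position + 1`."""
    if position in fa.double_bonds:
        raise ValueError(f"double bond already present at Δ-{position}")
    return FattyAcid(fa.carbons, tuple(sorted(fa.double_bonds + (position,))))


def elongate(fa: FattyAcid) -> FattyAcid:
    """Add two carbons at the carboxyl end; existing Δ positions shift by 2."""
    return FattyAcid(fa.carbons + 2, tuple(p + 2 for p in fa.double_bonds))


# n-6 pathway: LA -> GLA -> DGLA -> AA
LA = FattyAcid(18, (9, 12))
GLA = desaturate(LA, 6)
DGLA = elongate(GLA)
AA = desaturate(DGLA, 5)

# n-3 pathway: ALA -> (Δ-6 product) -> (elongation product) -> EPA
ALA = FattyAcid(18, (9, 12, 15))
EPA = desaturate(elongate(desaturate(ALA, 6)), 5)

for name, fa in [("GLA", GLA), ("DGLA", DGLA), ("AA", AA), ("EPA", EPA)]:
    print(name, fa.shorthand())
```

Under these assumptions the derived shorthands reproduce the ones quoted in the text for GLA, DGLA, AA, and EPA.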

1 To whom correspondence should be addressed. Fax: +81-824-22-7191. 
E-mail: aki@ipc.hiroshima-u.ac.jp. 




AA is well known as a precursor of a large family of 
eicosanoids that have multiple effects related to the 
regulation of, e.g., blood pressure, inflammatory reactions, 
and platelet function (1-3). EPA exhibits an antagonizing 
effect against AA metabolism, and vice versa (4, 5). 
Since the amounts and types of eicosanoids synthesized 
are partially determined by the availability of the 
fatty acid precursors, imbalance of these acids is suggested 
to contribute to numerous clinical symptoms. 
An early study indicated that the affinity of the Δ-6 desaturase 
for ALA is greater than that for LA, implying that 
these fatty acids might not be metabolized in the same 
fashion (6). Therefore, the imbalance of the levels of 
fatty acid precursors could be due to impaired 
activity of the Δ-6 desaturase on either of the two 
pathways. Indeed, depression of the Δ-6 desaturase 
activity, mainly reported on the n-6 pathway, is associated 
with various physiologic and pathophysiologic 
states including aging, diabetes, atopic dermatitis, cardiovascular 
disorders, and cancer (1, 7, 8). Also, 
differences in nutritional and hormonal conditions influence 
the Δ-6 desaturase activity, resulting in 
altered composition of intracellular fatty acids and 
membrane phospholipids (1, 9). Up to now, these observations 
have been made, in part, by tracing the activity 
of the enzyme, detected predominantly in the microsomal 
membrane fraction. However, molecular 
characterization of the membrane-bound desaturase 
protein, especially in mammals, has not been fully 
established. 

Recently, genes coding for Δ-6 desaturases from the 
borage Borago officinalis (10) and the nematode Caenorhabditis 
elegans (11) and Δ-5 desaturases from C. 
elegans (12) and the fungus Mortierella alpina (13, 14) 
were identified. Mutual comparisons of their deduced 
amino acid sequences revealed the presence of a highly 
conserved heme-binding motif and histidine boxes, located 
in the same order, which appeared to be common in 
all desaturases of bacteria and plants (15). Taking 
advantage of the sequence information on the desaturases 
in eukaryotes, we newly identified a rat liver 
cDNA encoding a functional Δ-6 fatty acid desaturase, as 
reported here. 

MATERIALS AND METHODS 

General laboratory chemicals were purchased from Katayama 
Chemical (Osaka, Japan). Fatty acid standards were from Sigma 
Chemical Co. (St. Louis, MO). Reagents and enzymes for genetic 
manipulations were from Takara Shuzo (Kyoto, Japan), unless otherwise 
stated. Male BALB/c mice were obtained from Charles River Japan 
(Hiroshima, Japan). 

Messenger RNAs were extracted from mouse liver by the guanidinium 
thiocyanate method using the QuickPrep mRNA Purification Kit (Amersham 
Pharmacia Biotech, Uppsala, Sweden). A cDNA pool was 
prepared from the mouse mRNAs with the TimeSaver cDNA Synthesis Kit 
(Amersham Pharmacia Biotech) according to the manufacturer's 
instructions. Oligonucleotide primers of the following sequences were 
synthesized for amplification of two separate regions of the gene coding for 
a desaturase-like protein by polymerase chain reaction (PCR): m3F, 
5'-GTCAGGGTGCTGGAGAGCCACTGG-3'; m3R, 5'-GTAGTGTAGGCCGTGCTTCGCGC-3'; 
m5F, 5'-GATGCTACGGATGCCTTCCGTGC-3'; m5R, 5'-TTCATGTCCTCAGCAGTCTTCTTC-3'. 
The cDNA from mouse liver was subjected to PCR (LA PCR kit, 
Takara) with primer pairs m3F and m3R, or m5F and m5R. Successfully 
amplified products, m3 and m5 respectively, were cloned into 
plasmid pGEM-T Easy Vector (Promega, Madison, WI), and the 
inserts were confirmed by DNA sequencing. 
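
Basic properties of primers like those listed above (length, GC content, reverse complement) are easy to tabulate. The following is a minimal sketch, not part of the original methods: the helper names are hypothetical, three of the primers (m3F, m3R, m5R) serve as examples, and the m3R sequence is joined across what is assumed to be a line-break hyphen in the printed text.

```python
# Complement table for unambiguous DNA bases only (A, C, G, T).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# Primer sequences as printed in the text (5' to 3').
PRIMERS = {
    "m3F": "GTCAGGGTGCTGGAGAGCCACTGG",
    "m3R": "GTAGTGTAGGCCGTGCTTCGCGC",  # assumes the hyphen was a line break
    "m5R": "TTCATGTCCTCAGCAGTCTTCTTC",
}

for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC = {gc_fraction(seq):.2f}, "
          f"reverse complement = {reverse_complement(seq)}")
```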

Rat liver cDNA library constructed on λZAP II (#937507, Stratagene, 
La Jolla, CA) was probed with alkaline phosphatase-labeled 
m3 or m5 fragment; labeling of the probes, hybridization, 
and detection of hybrids were performed using the AlkPhos Direct System 
(Amersham Pharmacia Biotech). In this protocol, hybridization 
and washing steps were done at 55°C. Plaques positively detected by 
either of the two probes were picked, and the accuracy of the first 
screening was reevaluated with purified plaques by another round of 
plating and hybridization. The plasmids containing positive cDNA 
were recovered from selected λ clones by in vivo excision, and the 
insert was entirely sequenced on both strands of DNA. 

Since one of the positive clones, r24, seemed to contain the full-length 
cDNA of interest, a plasmid derived from clone r24 was used as a 
template for PCR amplification of regions which were deduced as 
open reading frames. Because two ATG sequences could be considered 
as putative translation initiation codons, two forward primers, 
r24aF, 5'-ACAAAGCTTATGG(K>AAGGGAGGTAACCAG-3' (corresponding 
to the first ATG indicated by boldface type) and r24bF, 
5'-(^«^AGCTTATGCCCACCTTCCGCTGGGAG-3' (corresponding 
to the second ATG indicated by boldface type) were used to amplify 
the coding frames, r24a and r24b, respectively, and to generate a 
HindIII site (underlined) adjacent to the ATG. A reverse primer, r24R, 
5'-TCTTCTAGATCATTTGTGGAGGTAGGCATC-3' (annealing to 
the complement of the stop codon indicated by boldface type), was 
used for each PCR reaction, generating an XbaI site (underlined). The 
PCR products treated with HindIII and XbaI were inserted respectively 
into the yeast vector pYES2 (Invitrogen, San Diego, CA). It was 
confirmed by DNA sequencing that the entire and flanking 
sequences of the inserts were as designed. Transfer of the constructs 
into Saccharomyces cerevisiae strain INVSc1 (Invitrogen) 
was done by the lithium acetate method, and recombinant yeast cells 
were selected on uracil-deficient medium. The yeast cells were cultivated 
in a medium containing 4% raffinose, 0.7% yeast nitrogen 
base without amino acids, 1% tergitol type NP-40, 20 µg/ml histidine, 
60 µg/ml leucine, and 40 µg/ml tryptophan at 28°C overnight. The 
culture broth was supplemented with fatty acid substrate to 
a final concentration of 0.5 mM, followed by further cultivation until 
cell density reached 5 × 10® cells/ml. Expression of the transgene 
was induced by the addition of galactose to 2% (w/v) and an 
additional cultivation for 10 hr. 
Culture broths were harvested, and total intracellular lipids were 
extracted with a mixture of chloroform/methanol (2:1, v/v). The lipid 
fraction was subjected to methyl-esterification with 10% hydrochloride 
in methanol. Fatty acid methyl esters were applied to a gas-liquid 
chromatograph (GC; model GC-17A, Shimadzu, Kyoto, Japan) 
equipped with a TC-70 capillary column (GL Science, Tokyo, Japan) 
and a flame ionization detector. The conditions for GC analysis were as 
previously described (16). GC-mass spectrometry (GC-MS) analysis 
of the fatty acid methyl esters was performed using a MS-BU20 
(JEOL, Tokyo, Japan) high-resolution mass spectrometer linked to a 
gas chromatograph (model MS-5890, Hewlett Packard) equipped 
with the TC-70 column as the sample inlet, and operated in the 
electron impact mode at 70 eV. Mass spectra of authentic standards 
and of the peaks of interest in the total ion chromatogram 
were compared by visual and computer-based examination. 

RESULTS 

A nucleotide sequence corresponding to the highly 
conserved region (indicated by dotted line in Fig. 1) in 
Δ-6 desaturase from C. elegans (11) was used as a 
query to search databases for related sequences in 
mammals. When the database of mouse expressed sequence 
tags (EST) at the DNA Data Bank of Japan was 
searched using both the BLAST and FASTA algorithms, 
several entries registering DNA sequences partially 
homologous to the query were retrieved. The 
nucleotide sequence in the 5'-region of one of the ESTs 
(GenBank accession number W53753) was then used to 
further search the database and was found to partially 
overlap with another EST clone (AA512429). By 
similar sequential searches toward the 5'-end of a putative 
desaturase gene in mouse, a clone (AA036321) overlapping 
with AA512429 was found, and 
AA036321 led us to an additional clone (AA250162). 
The nucleotide sequence of AA250162 could be translated 
into an amino acid sequence bearing partial resemblance 
to that of the N-terminal domain of previously 
characterized desaturases. 

Based on the sequence information from ESTs 
W53753 and AA250162, we made two non-overlapping 
DNA fragments, m3 (3'-region) and m5 
(5'-region), respectively, by PCR with a mouse liver 
cDNA pool as a template. These fragments were used 
as hybridization probes for isolation of the entire coding 
region of the desaturase gene from a rat liver cDNA library. 
We selected rat, instead of mouse, as a source 
of the target gene, because the desaturases had been 
best characterized biochemically in rat, which included 
a report of partial purification of linoleoyl-CoA 
desaturase (17; see Discussion). The condition 
for hybridization was set at medium stringency, making 
allowance for differences in animal species. As a 
result of screening, five individual clones were isolated 
as positives to the probe m3, and only one of 
them, termed r24, also hybridized with the m5 
probe. Sequencing of these clones revealed that the 
clone r24 had a cDNA insert of 1573 basepairs (bp) in 
length (GenBank accession number AB021980), and 
FIG. 1. Composite alignments of the amino acid sequence deduced from the rat cDNA clone r24 with Δ-6 desaturases from other sources. 
Borage D6d, B. officinalis Δ-6 desaturase (GenBank accession number U79010); C. elegans D6d, C. elegans Δ-6 desaturase (AF031477). 
Nucleotide sequence corresponding to the highly conserved region (indicated by dotted line above the sequences) in C. elegans Δ-6 desaturase 
was used as a query to search databases, as mentioned in the text. Identical residues are boxed, and the conserved heme-binding motif and 
three histidine boxes are marked with single and double underlines, respectively. 



cDNAs in the remaining four clones (all of them less 
than 500 bp in length) corresponded to the 3'-region 
of r24 (data not shown), explaining their failure 
to hybridize with probe m5. A putative protein 
encoded by the clone r24 seemed to be a rat homolog 
of the protein from the mouse EST AA250162 (96.5% 
identity in a 491-nucleotide overlap). Thus, we chose 
the clone r24 for further characterization. 

Two ATG initiation codons (nucleotide numbers 
97-99 and 148-150) were found in the sequence of the 5'-terminal 
domain of the r24 cDNA, and they were 
placed in-frame. According to the Kozak consensus sequence, 
AGXXATGG, that has been advocated as the favored 
sequence for eukaryotic initiation sites (18), the 
first of the two initiation codons is credible. If this is 
the case, the cDNA contains a coding frame 1335 bp 
long including a TGA termination codon (nucleotide 
numbers 1429-1431), which can be translated into a 444 
amino acid polypeptide. Comparisons of the deduced 
amino acid sequence of r24 with Δ-6 desaturases from 
C. elegans and B. officinalis showed homology scores of 
27.9% and 26.4%, respectively (Fig. 1). It is noted in 
Fig. 1 that a typical heme-binding motif, HPGG (19), 
and three histidine boxes highly conserved within fatty 
acid desaturases (15) are present in the r24 sequence 
as well as the others. At the third histidine box, the 
first histidine residue in the conventional motif, 
HXXHH, was substituted with glutamine. This variance 
had occurred in Δ-6 and Δ-5 desaturases from 
fungus, plant, and lower animal (10-14). 
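
The reading-frame arithmetic quoted above can be checked directly from the nucleotide coordinates. The short Python sketch below is only an illustration (the function name `orf_summary` is hypothetical); it assumes 1-based, inclusive coordinates and that the final codon of each frame is the stop codon.

```python
def orf_summary(first_nt: int, last_nt: int) -> dict:
    """Length of an open reading frame and of the peptide it encodes.

    Coordinates are 1-based and inclusive; the last codon is taken to be
    the termination codon and is therefore not counted as an amino acid.
    """
    length_bp = last_nt - first_nt + 1
    if length_bp % 3 != 0:
        raise ValueError("coordinates do not span a whole number of codons")
    codons = length_bp // 3
    return {"length_bp": length_bp, "amino_acids": codons - 1}


r24a = orf_summary(97, 1431)   # frame starting at the first ATG
r24b = orf_summary(148, 1431)  # frame starting at the second ATG

print(r24a)  # {'length_bp': 1335, 'amino_acids': 444}
print(r24b)  # {'length_bp': 1284, 'amino_acids': 427}

# The two frames differ by the codons at nucleotides 97-147.
print(r24a["amino_acids"] - r24b["amino_acids"])  # 17
```

The 17-residue difference matches the statement that nucleotides 97-147 encode the first 17 amino acids of the longer frame.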



For functional analysis of the clone r24, two possible 
coding regions, named r24a and r24b (nucleotide numbers 
97-1431 and 148-1431, respectively), were amplified 
by PCR, and respective expression plasmids were 
constructed on the yeast vector pYES2. The PCR products 
were located just downstream of the galactose-inducible 
GAL1 promoter on each construct. After obtaining 
yeast transformants carrying pYES2/r24a, 
pYES2/r24b, or pYES2 (control), cells were cultivated, 
supplemented by the addition of substrate LA 
(C18:2Δ-9, 12), and induced in the presence of galactose. 
Aliquots of cells in the induced culture broth were 
taken for analyses of the intracellular fatty acid composition 
by GC, and the resultant chromatograms of 
fatty acid methyl esters are shown in Fig. 2. A novel 
peak, which was not apparent in the case of control 
(Fig. 2A), was detected in charts from both induced 
pYES2/r24a (peak 6 in Fig. 2B) and pYES2/r24b (data 
not shown). Similarly, when the substrate LA was replaced 
with ALA (C18:3Δ-9, 12, 15), a peak additional 
to the background level in the control case (Fig. 2C) 
was found in a GC profile obtained from the yeast 
transformed with either pYES2/r24a (peak 8 in Fig. 
2D) or pYES2/r24b (data not shown). We confirmed 
that these and other additional peaks did not appear 
when the yeast carrying pYES2/r24a or pYES2/r24b 
was not induced by galactose or was supplemented 
with no exogenous fatty acids. Comparisons of the 
retention times of the newly yielded peaks with those 
of authentic standards suggested that the fatty 
FIG. 2. GC analyses of methyl esterified fatty acids from the induced yeast cells containing pYES2 (A and C) or pYES2/r24a (B and D). 
Before the induction with galactose, LA (peak 5 in A and B) or ALA (peak 7 in C and D) was added to be incorporated into the cells. The peaks 
found only in the case of pYES2/r24a are indicated by arrows (peak 6 in B and peak 8 in D). Identities of other peaks were determined 
by comparing their retention times with those of authentic standards. Peaks 1, C16:0; 2, C16:1Δ-9; 3, C18:0; 4, C18:1Δ-9. 



acids giving the peaks 6 and 8 are GLA (C18:3Δ-6, 9, 
12) and cis-3, 6, 9, 12-octadecatetraenoic acid 
(C18:4Δ-6, 9, 12, 15), which are the Δ-6 desaturation products 
of LA and ALA, respectively. These assignments were 
confirmed by definitive identification of the 
compounds in peaks 6 and 8 by GC-MS analyses (data 
not shown). In a separate experiment, when DGLA 
(C20:3Δ-8, 11, 14), a substrate of Δ-5 desaturase, was 
added to our expression system, no extra peak was 
observed in chromatograms from the r24a/r24b recombinants, 
compared to the negative control. Taken together, 
the recombinant yeast containing the inducible 
r24 cDNA had gained the function of Δ-6, but not Δ-5, fatty 
acid desaturation. 

DISCUSSION 

Here we isolated a rat liver cDNA coding for the Δ-6 
fatty acid desaturase. Although the cDNA, r24, was 
successfully expressed in yeast, we could not determine 
the actual ATG initiation codon corresponding to the 
methionine residue at the amino terminus of the native 
desaturase protein. This is because, in our study, no 
significant differences were detected between 
the two lines of expression analyses on r24a and r24b. 
This observation suggests that the first 17 amino acids 
of the protein expressed from r24a are not needed 
for function in yeast, although this portion might be 
indispensable in rat. Another set of experiments, including 
the purification of the native Δ-6 desaturase, is essential 
to clarify this point and is being undertaken. 



Okayasu et al. (17) described a purification of rat 
liver linoleoyl-CoA desaturase that was capable of converting 
linoleoyl-CoA to γ-linolenoyl-CoA in vitro. The 
apparent molecular weight of this enzyme (66 kD) 
obviously differs from the molecular weights calculated 
from the deduced amino acid sequences of r24a 
(52.4 kD) and r24b (50.7 kD). This raises the 
possibility of the presence of more than two types of 
enzymes carrying out Δ-6 fatty acid desaturation. 
Sprecher and his colleagues have proposed a 
novel pathway, docosapentaenoic acid to docosahexaenoic 
acid via Δ-6 desaturation, for the biosynthesis of 
polyunsaturated fatty acids (20, 21). However, the putative 
involvement of a single cycle of peroxisomal 
β-oxidation in this pathway is under critical reevaluation, 
which would also exclude the necessity of the proposed Δ-6 
desaturation step (22). A metabolic study by Christiansen 
et al. (23) suggested that liver microsomes 
might contain separate enzymes for desaturation of LA 
and ALA. Their observations, however, seem to be inconsistent 
with the result of a competitive study using 
fatty acid tracers (24), which implies that a single enzyme 
may govern desaturation of fatty acids at the Δ-6 position. To 
date, no clear conclusion has been reached as to whether 
multiple forms of Δ-6 desaturase exist. 

In relation to these pending questions, we are attempting 
to isolate and characterize a full-length 
cDNA corresponding to the EST clones W53753, 
AA512429, and AA036321, since the nucleotide sequences 
of these clones can be translated into amino acid sequences 
that are significantly homologous, but not 
identical, to the sequence from our clone r24 (data not 
shown). This gene may encode an isoform of the Δ-6 
desaturase, which is dominantly expressed in tissues 
other than liver or at different developmental 
stages. This assumption does not contradict the facts 
that we were unable to isolate a liver cDNA whose 
sequence matched the probe m3 (from W53753), 
and that these ESTs are derived from embryo and mammary 
gland. Otherwise, a protein encoded by this gene 
may be one of the other desaturases, for example, Δ-5 
desaturase, which has not yet been identified in mammals. 
In either case, the cloning of the mammalian 
desaturase gene(s) will accelerate elucidation of the 
molecular mechanisms by which the enzyme regulates various 
cellular events, possibly through alteration 
of the physical state of membrane lipids and of the 
level of pooled precursors for signal transducers. 
Arachidonic acid (20:4(n-6)) and docosahexaenoic acid 
(22:6(n-3)) have a variety of physiological functions that 
include being the major component of membrane phospholipid 
in brain and retina, substrates for eicosanoid 
production, and regulators of nuclear transcription factors. 
The rate-limiting step in the production of 20:4(n-6) 
and 22:6(n-3) is the desaturation of 18:2(n-6) and 
18:3(n-3) by Δ-6 desaturase. In this report, we describe 
the cloning, characterization, and expression of a mammalian 
Δ-6 desaturase. The open reading frames for 
mouse and human Δ-6 desaturase each encode a 444-amino 
acid peptide, and the two peptides share an 87% 
amino acid homology. The amino acid sequence predicts 
that the peptide contains two membrane-spanning domains 
as well as a cytochrome b5-like domain that is 
characteristic of nonmammalian Δ-6 desaturases. Expression 
of the open reading frame in rat hepatocytes 
and Chinese hamster ovary cells instilled in these cells 
the ability to convert 18:2(n-6) and 18:3(n-3) to their respective 
products, 18:3(n-6) and 18:4(n-3). When mice 
were fed a diet containing 10% fat, hepatic enzymatic 
activity and mRNA abundance for hepatic Δ-6 desaturase 
in mice fed corn oil were 70 and 60% lower than in 
mice fed triolein. Finally, Northern analysis revealed 
that the brain contained an amount of Δ-6 desaturase 
mRNA that was several times greater than that found in 
other tissues including the liver, lung, heart, and skeletal 
muscle. The RNA abundance data indicate that 
prior conclusions regarding the low level of Δ-6 desaturase 
expression in nonhepatic tissues may need to be 
reevaluated. 

Long chain polyunsaturated fatty acids such as 20:4(n-6) and 
22:6(n-3) play pivotal roles in a number of biological functions 
including brain development, cognition, inflammatory responses, 
and hemostasis (1-4). Over 30% of the fatty acid in 
brain phospholipid consists of 20:4(n-6) and 22:6(n-3), and approximately 
50% of the fatty acid in the retina is 22:6(n-3) (5, 
6). An inadequate availability of 20:4(n-6) is associated with 
impaired nerve transmission, reduced eicosanoid synthesis, 
and impaired fetal growth (7-9). Recently, premature infants 
were found to have reduced cognitive development, apparently 
because they could not synthesize adequate quantities of 
22:6(n-3) to meet the biological demands for proper retina function 
(1, 10). In addition to being vital components of membrane 
phospholipids and functioning in key steps of cell signaling, 20- 
and 22-carbon polyunsaturated fatty acids govern the expression 
of a wide array of genes, including those encoding proteins 
involved with lipid metabolism, thermogenesis, and cell differentiation 
(11-14). 

The availability of 20- and 22-carbon (n-6) and (n-3) polyenoic 
fatty acids is greatly dependent upon the rate of desaturation 
of 18:2(n-6) and 18:3(n-3) by Δ-6 desaturase (15). Δ-6 
Desaturase is a microsomal enzyme (15) and is thought to be a 
component of a three-enzyme system that includes NADH-cytochrome 
b5 reductase, cytochrome b5, and Δ-6 desaturase 
(16). The enzymatic activity for Δ-6 desaturase is reportedly 
low in most tissues except the liver (16). Consequently, the 
liver has been considered the primary site for the production of 
long chain polyenoic fatty acids (17, 18). Numerous dietary 
studies indicate that hepatic Δ-6 desaturase activity is induced 
by diets low in essential fatty acids and suppressed by diets 
rich in vegetable or marine oils (19, 20). In addition, Δ-6 desaturase 
activity is induced by peroxisome proliferators and by 
the administration of insulin to diabetic rats (21, 22). Unfortunately, 
defining the molecular determinants of Δ-6 desaturase 
activity, as well as characterizing its developmental pattern 
and tissue distribution, has been hampered by the fact that Δ-6 
desaturase has been neither cloned nor reproducibly purified. 
Therefore, our objective was to clone the mammalian Δ-6 desaturase 
and utilize the cDNA to examine the tissue distribution 
and nutritional regulation of Δ-6 desaturase mRNA. 

EXPERIMENTAL PROCEDURES 

Cloning of the Mouse Δ-6 Desaturase cDNA — A murine cDNA (GenBank 
accession number W53753) displaying high homology to the 
amino acid sequence of Δ-6 desaturase from Synechocystis sp. was 
acquired and sequenced. Subsequently, a 23-base oligonucleotide 
primer (5'-CTTGGCATCGT(KKXJAAGAGGTG-3') specific for the 5' 
end of murine cDNA W53753 was synthesized and utilized to screen a 
mouse adaptor-ligated liver cDNA library (Marathon-Ready cDNA; 
CLONTECH) by rapid amplification of cDNA ends-PCR.¹ The PCR 
conditions consisted of an initial denaturation step of 94 °C for 1 min, 
followed by 5 cycles of 94 °C for 10 s and 72 °C for 4 min, 5 cycles of 
94 °C for 10 s and 70 °C for 4 min, and, finally, 20 cycles of 94 °C for 10 s 
and 68 °C for 4 min. The resulting rapid amplification of cDNA ends-PCR 
product was cloned into pBluescript (Stratagene) and sequenced 
by the dideoxy chain termination method (23). 
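
The stepped-annealing cycling program described above can be written down as data, which makes the total cycle count and programmed hold time easy to tally. This is a sketch under stated assumptions: the `Stage` type and helper names are hypothetical, every denaturation step is read as 94 °C for 10 s (two of these values are partly illegible in this copy), and ramp times between temperatures are ignored.

```python
from typing import List, NamedTuple, Tuple


class Stage(NamedTuple):
    """One block of the thermal program (hypothetical structure)."""
    cycles: int
    steps: Tuple[Tuple[int, int], ...]  # (temperature in °C, hold time in s)


PROGRAM: List[Stage] = [
    Stage(1, ((94, 60),)),             # initial denaturation
    Stage(5, ((94, 10), (72, 240))),   # 5 cycles, 72 °C anneal/extension
    Stage(5, ((94, 10), (70, 240))),   # 5 cycles, 70 °C anneal/extension
    Stage(20, ((94, 10), (68, 240))),  # 20 cycles, 68 °C anneal/extension
]


def amplification_cycles(program: List[Stage]) -> int:
    """Count cycles in stages that have more than one step."""
    return sum(stage.cycles for stage in program if len(stage.steps) > 1)


def programmed_hold_seconds(program: List[Stage]) -> int:
    """Sum of all programmed hold times, excluding temperature ramps."""
    return sum(
        stage.cycles * sum(seconds for _, seconds in stage.steps)
        for stage in program
    )


print(amplification_cycles(PROGRAM))          # 30
print(programmed_hold_seconds(PROGRAM) / 60)  # 126.0 (minutes)
```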

The nucleotide sequence of the PCR product was utilized to BLAST 
search the mouse EST database. Two mouse cDNAs (GenBank accession 
numbers AA237892 and AA250162) possessing 100% nucleotide 
homology with our PCR product were identified and acquired from 
Genome Systems. Clone AA250162 contained two possible AUG start 
codons, and the EST cDNA AA237892 contained an apparent stop 
codon. The two EST cDNAs were fused at the StyI restriction site, and 



¹ The abbreviations used are: PCR, polymerase chain reaction; ORF, 
open reading frame; EST, expressed sequence tag; CHO, Chinese hamster 
ovary; kb, kilobase pair(s). 



This paper is available on line at http://www.jbc.org 



Mammalian Δ-6 Desaturase 



the product was inserted into the cytomegalovirus promoter expression 
vector pcDNA3.1 (Invitrogen). Sequence analysis and prediction of 
amino acid sequence were performed using MacDNASIS Pro (Hitachi), 
and a translation initiator codon was determined based on Kozak's rule 
(24). 

Cloning of the Human Δ-6 Desaturase cDNA — Using the nucleotide 
sequence of mouse liver Δ-6 desaturase, the EST human database was 
searched for a human homologue cDNA. The search identified a highly 
homologous human brain EST cDNA (GenBank accession number 
Z44979), which was purchased from Genome Systems and sequenced 
for verification. The 5' end of the human cDNA was extended by PCR 
using a human brain cDNA library (Marathon-Ready cDNA; CLONTECH). 
The forward oligonucleotide primer (5'-AGAGTGGCAGCATGGGGAAG-3') 
was prepared using the 5' end of the mouse Δ-6 desaturase 
and designed to include the putative start codon. The reverse 
primer (5'-GATGGTGGGGAAGAGGTGGTG-3') was prepared from the 
sequence derived from human Z44979 cDNA. 

Expression of the Mouse Δ-6 Desaturase — Cellular expression of the 
mouse Δ-6 desaturase was performed in rat primary hepatocytes and 
CHO cells. Rat primary hepatocytes were isolated by collagenase perfusion 
and allowed to attach to a collagen-coated 60-mm culture plate in 
3 ml of Waymouth 752 medium supplemented with 0.5% fetal bovine 
serum and 1 µM insulin and dexamethasone (25). After a 6-h attachment 
period, the cells were washed with phosphate-buffered saline and 
transfected with 6 µg of the mouse Δ-6 desaturase expression plasmid 
or the pcDNA3.1 expression vector alone using 6.6 µl of Lipofectin per 
µg of DNA (Life Technologies, Inc.). Transfection was conducted by 
adding the mixture of Lipofectin and DNA in the absence of fetal bovine 
serum. After the 12-h transfection period, the medium was replaced 
with one containing either 200 µM albumin-bound 18:2(n-6) (molar 
ratio of fatty acid to albumin, 4:1) or albumin alone. CHO cells were 
grown in Kaighn's modification of Ham's F-12 medium supplemented 
with 10% fetal bovine serum in a 25-cm² flask. At 80% confluence, the 
serum-containing medium was removed, and cells were washed with 
phosphate-buffered saline for transfection. A mixture of 2 µg of the 
mouse Δ-6 desaturase expression plasmid, 12 µl of LipofectAMINE, and 
8 µl of Plus reagent (Life Technologies, Inc.) was added to cells without 
serum for 4 h. Subsequently, 10% serum was added to the transfection 
media for 8 h. After a total 12-h transfection period, the CHO cells were 
treated with either 200 µM albumin-bound 18:3(n-3), 20:3(n-6), or albumin 
alone. The hepatocytes and CHO cells were incubated with the 
treatment medium for 24 h and then used for fatty acid analysis. 

Fatty Acid Extraction and Analysis — Cellular fatty acid was extracted 
by saponifying fatty acids using 1 ml of 30% KOH and 1 ml of 
ethanol. Fatty acids from the treatment medium were also extracted 
and analyzed after 24 h of incubation. Heptadecanoic acid was added to 
the saponification mixture as an internal standard. After saponification, 
the nonsaponifiable lipids were removed by extraction with petroleum 
ether. Subsequently, the solution was acidified, and the fatty acids 
were extracted with petroleum ether. The extract was dried under 
nitrogen, and the residue was methylated using 14% boron trifluoride 
in methanol (Sigma). Methylated fatty acids were separated and quantified 
by gas chromatography using a fused silica glass capillary column 
(50 m × 530 µm internal diameter; Quadrex). The column temperature 
program was composed of an initial hold at 140 °C for 5 min, ramping 
at 5 °C per min to 220 °C, and a final hold at 220 °C for 7 min. The 
injector temperature was 250 °C, and the flame ionization detector 
temperature was 260 °C. 
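
The column temperature program above (initial hold, linear ramp, final hold) fixes the length of each chromatographic run. The sketch below works that out and gives the oven temperature at any time point; the `OvenProgram` class and its method names are hypothetical, and only the hold times, ramp rate, and temperatures come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OvenProgram:
    """Single-ramp GC oven program (hypothetical helper)."""
    start_c: float
    start_hold_min: float
    ramp_c_per_min: float
    end_c: float
    end_hold_min: float

    @property
    def ramp_min(self) -> float:
        # Time needed to climb from the start to the end temperature.
        return (self.end_c - self.start_c) / self.ramp_c_per_min

    @property
    def run_time_min(self) -> float:
        return self.start_hold_min + self.ramp_min + self.end_hold_min

    def temperature_at(self, t_min: float) -> float:
        """Programmed oven temperature t_min minutes after injection."""
        if t_min <= self.start_hold_min:
            return self.start_c
        if t_min < self.start_hold_min + self.ramp_min:
            return self.start_c + self.ramp_c_per_min * (t_min - self.start_hold_min)
        return self.end_c


program = OvenProgram(start_c=140, start_hold_min=5, ramp_c_per_min=5,
                      end_c=220, end_hold_min=7)

print(program.ramp_min)            # 16.0
print(program.run_time_min)        # 28.0
print(program.temperature_at(10))  # 165.0
```

With these values the ramp takes 16 min, so a complete run lasts 28 min.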

Nutritional Regulation of Δ-6 Desaturase Expression — Male BALB/c 
mice were fed a high-glucose, fat-free diet for a i -day adaptation period. 
After this period, the fat-free diet was supplemented with either 10% 
corn oil or 10% triolein (Sigma; 99% purity), and the mice (n = 4 
mice/group) were fed for an additional 5 days. Liver tissues were removed, 
and microsomes were isolated by differential centrifugation. 
One g of liver was homogenized in 4 ml of homogenization buffer 
containing 50 mM potassium phosphate, pH 7.4, and 0.25 M sucrose. 
After a 10-min centrifugation of the homogenate at 10,000 × g, the 
resulting supernatant was spun at 100,000 × g for 60 min to isolate a 
microsomal pellei. A\~icr re.suspending the pti-llet in hontogenization 
hnfler. .'3 rnt^ of inicro.=o:iia! pn^lein wure incubatod in a 'M °C .shaking 
water batli for 5 min wiih 1 ini of reaction niixturu including 1.2 m.M 
N.-\r)H. 'i.(> mM ATP. 0.*-> m.M coenzyme A. 4.8 mM MgCi.j, 72 m.M 
l)lM).<pli;H.r; bu!T..r. pH 7.4. and 50 nmol of 1 - • 'O-labr^led l.S:2(/(-(Ji. The 
reaction wa.-^ <v>pi^.'<\ by adding saponification reagi.-nl, and fatty acids 
wen* sa()<>nifitd and n:clhylaU:d as descriixd aljov.j. Karlioaciive IH: 
2'//-(i» and 1 .-^rii- /»-6 • vw-.-v se-paraleti by .silv-.-r iiiiratf -imp>'.'g'»:>ifd thin 
layr rhrMni:ilo-r;jphy Th- radiuari i viJ y v.;:- .pjantifief) usiuf^ an Am- 
hi's raili.>-iin;.i:.T I».-a;nrasi- enzyiii'- ;:.;:ivity is ♦'.Mpres.^e^l as \h" 



percentage of 18:2(«-6» converted to lS:3ui-6) per mg of protein/min. 

RNA Extraction and Northern Blot Analysis—Total RNA was isolated
from the liver of mice in the dietary study using the phenol-guanidinium
isothiocyanate method (26). Twenty μg of total RNA were
size-fractionated on a 1% formaldehyde gel and then transferred to a
Zeta probe nylon membrane (Bio-Rad). The mouse Δ-6 desaturase probe
was prepared by incorporating [α-32P]dCTP by PCR. The forward primer
was 5'-GGACATAAAGAGCCTGCATG-3', and the reverse primer was
5'-ACTGGAAGTACATAGGGATG-3'. The Northern membrane of human
tissues was purchased from Invitrogen. The radiolabeled probe for
the human tissue blot was a 200-base pair PCR fragment of human Δ-6
desaturase using primers of 5'-GGCAAGAACTCA-^AGATCAC-3' and
5'-GAGAGGTAGCAAGAACAA.'\G-3'. The autoradiographic signal was
quantified using Instant Imager (Packard).

RESULTS 

Cloning Mouse and Human Δ-6 Desaturase—The mouse EST
database was searched for mammalian homologues using the
amino acid sequence for Δ-6 desaturase from the photosynthetic
cyanobacterium Synechocystis sp. (27). A mouse cDNA
that had a 60% similarity to a 46-amino acid sequence of
Synechocystis Δ-6 desaturase was identified. A 150S-base pair
cDNA sequence for mouse liver Δ-6 desaturase was acquired
using a combination of ligation-mediated PCR screening of a
mouse liver cDNA library and BLAST searches of the mouse
EST database (Fig. 1A). Sequence analysis revealed the presence
of two in-frame methionine codons located at positions 75
and 126. In addition, a TGA termination sequence was identified
at position 1407. Kozak's rule, which predicts that the
favored eukaryotic translation initiation sequence resides in
the sequence of AXXATGG (24), indicated that the first of the
two ATG codons was the preferred initiation codon for the
putative Δ-6 desaturase. The apparent ORF between the first
ATG codon and the TGA termination codon predicted a peptide
consisting of 444 amino acids and having a size of 52.2 kDa.
The human brain cDNA homologue for Δ-6 desaturase contains
an initiation codon and a termination codon that are perfectly
aligned with the initiation and termination codons of the mouse
cDNA (Fig. 1A). Moreover, the amino acid sequence derived
from the ORF revealed that 87% of the amino acid sequence for
the mouse and human homologues was identical, and 96% of
the sequence had similarity (Fig. 1B). A search of the Swiss
Protein Database indicated that the putative Δ-6 desaturase
sequence was unique and shared very little amino acid homology
with any other mammalian proteins including the murine
stearoyl-CoA desaturase (Δ-9 desaturase) (28).
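The ORF size quoted above can be checked directly from the reported codon positions. This is a minimal sketch, assuming the positions 75, 126, and 1407 refer to the first nucleotide of each codon; the function name `orf_summary` is illustrative only.

```python
def orf_summary(start_pos, stop_pos):
    """Return (coding nucleotides, codons) between a start and a stop codon.

    Both positions are assumed to mark the first base of their codon, so the
    stop codon itself is excluded from the coding length.
    """
    coding_nt = stop_pos - start_pos
    if coding_nt <= 0 or coding_nt % 3 != 0:
        raise ValueError("start and stop codons are not in the same frame")
    return coding_nt, coding_nt // 3


coding_nt, codons = orf_summary(75, 1407)      # first ATG to TGA
second_atg_in_frame = (126 - 75) % 3 == 0      # is the second ATG in frame?
mean_residue_da = 52.2e3 / codons              # average mass per residue

print(coding_nt, codons, second_atg_in_frame)  # 1332 444 True
print(round(mean_residue_da, 1))               # 117.6
```

The result, 1332 coding nucleotides and 444 codons, agrees with the ORF length and peptide size stated in the text, and the second methionine codon lies 17 codons downstream in the same frame.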

Structural Characteristics of Mammalian Δ-6 Desaturase—
The enzymatic activity of mammalian Δ-6 desaturase is associated
with the microsomal membrane fraction (15). Consistent
with such membrane involvement, the predicted amino acid
sequence for Δ-6 desaturase indicated that the peptide contains
52% nonpolar amino acids, and a hydropathy profile revealed
the presence of two membrane-spanning domains that are
characteristic of membrane-anchored proteins (Figs. 1B and 2).
In addition, the amino terminus of the Δ-6 desaturase peptide
contains a hydrophilic domain of 54 amino acids that is highly
homologous with the heme-binding domain of cytochrome b5
(Fig. 3A). This cytochrome domain is also found in the
Δ-6 desaturases from Borago officinalis (29) and Caenorhabditis
elegans (30) (Fig. 3A). The His[illegible] and His[illegible] residues
located within this domain of the mammalian Δ-6 desaturase are
exactly aligned with the two heme-binding histidines in cytochrome
b5 (31). Moreover, these two histidines are surrounded
by charged amino acids that may contribute to the stabilization
of the heme-histidine complex (31). In addition, the sequence
[illegible] predicts the existence of a dramatic β-turn that may
render His[illegible] more accessible to heme iron binding (31).

A second noteworthy feature of the mammalian Δ-6 desaturase
is the presence of three histidine-rich regions (Fig. 1B).

Fig. 1. Alignment of the predicted amino acid sequences for mouse
and human Δ-6 desaturase. A, schematic diagram of the ORF and
untranslated regions for mouse and human Δ-6 desaturase. The hatched
box indicates an ORF of 1332 nucleotides, and the open boxes represent
untranslated regions. The human cDNA contains a polyadenylation
signal AATAAA at the 3' end. B, a comparison of the amino acid
sequences for mouse and human Δ-6 desaturase predicted by the
nucleotide sequence of the ORFs. Both mouse and human ORFs encode
444 amino acids. Identical amino acids are paired by vertical lines,
and conserved amino acids are matched by colons. The cytochrome
b5-like domain is underlined. Transmembrane domains are shown in
shaded areas, and three histidine-rich domains are in bold.

Regions I (HX3H) and II (HX2HH) are located between the two
transmembrane domains, and region III (HH) is located near
the carboxyl terminus of the peptide. These histidine-rich regions
are also found in plant membrane desaturases and mammalian
stearoyl-CoA desaturase and reportedly bind non-heme
iron that is required for enzymatic activity (32).
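The hydropathy profile referred to in this section is a sliding-window average of per-residue hydropathy values. The sketch below uses the published Kyte-Doolittle scale; the window size and the toy sequence are assumptions chosen for illustration and are not the desaturase sequence or the authors' settings.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Mean Kyte-Doolittle hydropathy for every full window along a sequence."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Hypothetical toy peptide: a polar stretch followed by a hydrophobic stretch.
toy_sequence = "MKDEHSRQNDLLIVVAFLLIVGAL"
profile = hydropathy_profile(toy_sequence, window=9)
most_hydrophobic_start = max(range(len(profile)), key=profile.__getitem__)
print(most_hydrophobic_start, round(max(profile), 2))
```

Sustained positive stretches in such a profile are the usual indication of candidate membrane-spanning segments, which is how the two transmembrane domains described above would appear.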

Expression of Δ-6 Desaturase—The predicted structural
characteristics of the mouse and human peptides strongly suggested
that the cDNAs did in fact correspond to mammalian
Δ-6 desaturase. To confirm this conclusion, the ORF for the
mouse Δ-6 desaturase was expressed in primary cultures of rat
hepatocytes and in CHO cells. Fatty acid analysis revealed that
hepatocytes transfected with the vector containing the Δ-6
desaturase ORF were capable of synthesizing the Δ-6 desaturase
product 18:3(n-6) from 18:2(n-6) (Fig. 4A). On the other
hand, hepatocytes transfected with vector alone produced no
detectable 18:3(n-6) product (Fig. 4B). Similarly, CHO cells
expressing Δ-6 desaturase readily converted the Δ-6 desaturase
substrate 18:3(n-3) to the Δ-6 desaturase product 18:4(n-3),
whereas nontransfected CHO cells were unable to produce
detectable levels of 18:4(n-3) (Fig. 4, C and D). In contrast,
providing CHO cells with the Δ-5 desaturase substrate
20:3(n-6) did not lead to the production of the Δ-5 desaturase
product 20:4(n-6) (data not shown). These data conclusively
demonstrate that the mouse and human ORFs do in fact encode
mammalian Δ-6 desaturase.

Nutritional Regulation of Δ-6 Desaturase Expression—The
enzymatic activity of Δ-6 desaturase increases when animals
are fed an essential fatty acid-deficient diet, whereas it decreases
when polyunsaturated fatty acids are ingested (16, 19,
20). Using the mouse cDNA for Δ-6 desaturase, we have found
that the suppression of hepatic Δ-6 desaturase enzymatic activity
associated with the ingestion of polyunsaturated fat (i.e.
corn oil) is paralleled by a comparable reduction in Δ-6 desaturase
mRNA abundance (Fig. 5, A and B). Interestingly,
whereas the dominant transcript of hepatic Δ-6 desaturase is
approximately 4.0 kb in size, the mouse liver also contains a
minor transcript that is approximately 2.2 kb (Fig. 5B). Both
transcripts appeared to be suppressed by dietary corn oil to the
same degree. In addition, hybridizing the Northern blot with
sequences from the 5', middle, and 3' regions of the Δ-6 desaturase
ORF yielded the same outcomes with respect to the abundance
and dietary response of the 2.2-kb transcript (data not
shown). The reason for these two different transcripts remains
unknown.
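The statement that both transcripts are suppressed to a similar degree can be illustrated with the mean cpm values reported in the legend to Fig. 5. This is a rough sketch using only those published means; the variable names are illustrative and no statistical test is implied.

```python
# Mean probe cpm from the Fig. 5 legend: (triolein group, corn oil group).
transcript_cpm = {
    "4.0 kb": (2509, 1264),
    "2.2 kb": (327, 185),
}

# Abundance in corn oil-fed mice relative to triolein-fed mice.
relative_abundance = {
    name: corn_oil / triolein
    for name, (triolein, corn_oil) in transcript_cpm.items()
}

for name, ratio in relative_abundance.items():
    # Prints roughly 0.50 for the 4.0-kb and 0.57 for the 2.2-kb transcript.
    print(f"{name}: {ratio:.2f} of the triolein level")
```

On these means, both transcripts fall to roughly half of the triolein level, consistent with the description above.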

Δ-6 Desaturase mRNA Distribution in Human Tissues—

Fig. 2. Hydropathy profile of mouse (A) and human (B) Δ-6
desaturase. The hydropathic pattern for Δ-6 desaturase was plotted
using the method of Kyte-Doolittle, and the amino acid sequences
were predicted by the respective ORFs. Bars, the transmembrane
regions. Boxed H, locations of histidine-rich regions.

[Fig. 2 plots, panels A and B; horizontal axis: number of amino acids.]

[Fig. 3 alignment panels not recoverable. Panel A rows: D6D (mouse,
human, C. elegans, borage) and Cyto. b5 (mouse, human). Panel B rows:
D6D (mouse, human, C. elegans, borage, Synechocystis sp.) and SCD
(mouse, human), covering histidine-rich Regions I, II, and III.]

Fig. 3. A comparison of the cytochrome b5-like and [illegible]
domains. A, a comparison of the cytochrome b5-like domain for
[illegible] (29), and C. elegans (30) Δ-6 desaturases reveals a high
[illegible] sequence within the cytochrome b5-like domain [illegible]
(Cyto. b5) (31). Amino acids that are [illegible] highlighted in dark
gray.



[Fig. 4 chromatograms plotted against retention time. Panels: A,
D6D + 18:2(n-6); B, Vector + 18:2(n-6); C, D6D + 18:3(n-3); D,
Non-transfected + 18:3(n-3).]

Fig. 4. Expression of mouse Δ-6 desaturase in rat hepatocytes and
CHO cells. A shows the conversion of 18:2(n-6) to 18:3(n-6) by
hepatocytes that were transfected with the pcDNA3.1 vector containing
the mouse Δ-6 desaturase (D6D) ORF. When hepatocytes were transfected
with pcDNA3.1 vector lacking the Δ-6 desaturase ORF, there was no
detectable conversion of 18:2(n-6) to 18:3(n-6) (B). The media of CHO
cells incubated with albumin-bound 18:3(n-3) and transfected with the
Δ-6 desaturase expression vector pcDNA3.1 contained 18:4(n-3) (C),
whereas the media of nontransfected CHO cells contained no detectable
18:4(n-3) (D). Retention times for the fatty acids are shown above the
respective peaks. The identity of each peak was confirmed using
individual fatty acid methyl ester standards.



Northern analysis of Δ-6 desaturase expression revealed that
human Δ-6 desaturase mRNA is a single transcript of approximately
3.2 kb and is expressed in a wide array of tissues
including the brain, liver, lung, and heart (Fig. 5C). The level of
Δ-6 desaturase mRNA in the liver was approximately the same
as that found in the lung and heart, but the abundance of Δ-6
desaturase in the human brain was severalfold higher (Fig.
5C). In addition to the tissues examined by Northern analysis,
a search of the EST database revealed that Δ-6 desaturase
mRNA is expressed in the human fetus and fetal heart as well
as in the 13-day-old mouse embryo heart.

DISCUSSION 

The purification and characterization of mammalian Δ-6 desaturase
have been difficult because of its instability. In fact,
there has been only one report, in 1981, that describes the
purification of a putative linoleoyl-CoA desaturase from rat
liver (33). Because of the problems encountered in the purification
of Δ-6 desaturase, we have used the EST database to
clone and characterize the mouse and human Δ-6 desaturase
enzyme. Interestingly, a comparison of the rat liver linoleoyl-CoA
desaturase with the Δ-6 desaturase peptide predicted by
the ORF of both the mouse and human cDNAs indicates that
the two proteins are markedly different. First, the ORF for
mouse and human Δ-6 desaturase predicts a protein that is
52.2 kDa, whereas the size of the linoleoyl-CoA desaturase was
cited to be 66 kDa (33). Second, the nucleotide sequence of the
mouse and human Δ-6 desaturase ORFs predicts that these
peptides contain 30 histidines (Fig. 1B). Moreover, many of
these histidines are organized into distinct histidine-rich domains.
Such domains are characteristic of all membrane-associated
desaturases (32). In contrast, the reported amino acid
composition of linoleoyl-CoA desaturase indicates that it contains
only 15 histidine residues (33). Unfortunately, sequence
information for linoleoyl-CoA desaturase is not available, because
the purification of linoleoyl-CoA desaturase has never
been replicated since the initial report. Clearly, the Δ-6 desaturase
and the putative linoleoyl-CoA desaturase are distinctly
different proteins. It is possible the liver contains two Δ-6
desaturase enzymes. In fact, metabolic studies suggest that
there may be two isoforms of Δ-6 desaturase (34, 35): (a) one
that catalyzes the initial desaturation of 18:2(n-6) or 18:3(n-3),
and (b) another that catalyzes the conversion of 24:5(n-3) to
24:6(n-3). The cloning of the Δ-6 desaturase should now permit
us to determine whether isoforms of Δ-6 desaturase do exist.

In addition to the histidine-rich domains, the mammalian
Δ-6 desaturase contains a distinct cytochrome b5-like domain
that is also characteristic of plant (Borage) and C. elegans Δ-6
desaturases (29, 30) but is not a component of the mammalian
Δ-9 desaturase (36). Early reconstitution studies with Δ-9 desaturase
indicated that the conversion of 18:0(n-9) to 18:1(n-9)
required Δ-9 desaturase, cytochrome b5 reductase, and cytochrome
b5 itself (36). It has been assumed from these early
studies that all mammalian desaturases require cytochrome b5
for enzymatic activity (16, 36). However, the cytochrome b5-like
domain of yeast OLE1 was recently reported to replace the
requirement for cytochrome b5; i.e. desaturation occurred in the
absence of cytochrome b5, and removal of the cytochrome b5-like
domain rendered the OLE1 enzyme inactive (37). This
observation raises the possibility that cytochrome b5 reductase
transfers electrons to the catalytic domain of the Δ-6 desaturase
via the cytochrome b5-like domain, and not via cytochrome
b5 per se.

Fig. 5. Nutritional regulation and tissue distribution of mammalian
Δ-6 desaturase. Mice were fed a high-glucose diet containing
10% corn oil or 10% triolein. Hepatic Δ-6 desaturase activity, which is
expressed in A as means ± S.E., was significantly lower in mice fed corn
oil than in mice fed triolein (p < 0.001). The abundance of hepatic Δ-6
desaturase mRNA was determined by Northern analysis (B). The abundance
of the 4.0- and 2.2-kb Δ-6 desaturase transcripts was quantified
by radio-imaging. The cpm of the 32P-labeled probe associated with the
4.0- and 2.2-kb transcripts was 2509 ± 154 and 327 ± 17; it was 1264 ±
66 and 185 ± 9 for the triolein and corn oil groups, respectively (p <
[illegible]). C depicts the abundance of Δ-6 desaturase mRNA found in a
variety of adult male human tissues. Each lane contains 20 μg of total
RNA. Unlike mice, only one Δ-6 desaturase transcript with an approximate
size of 3.2 kb was detected in human tissues. Comparable results
were obtained with three different Northern blots.

Hepatic Δ-6 desaturase enzymatic activity varies with hormonal
and nutritional manipulation (15, 16, 20, 38). For example,
insulin deficiency and fasting reduce Δ-6 desaturase enzymatic
activity, whereas the administration of insulin or
refeeding increases its activity (39). In addition to being affected
by fasting and feeding, hepatic Δ-6 desaturase enzymatic
activity is highly dependent upon the composition of dietary fat
(16). Specifically, the ingestion of fats that are low in essential
fatty acids (e.g. butter) results in higher levels of enzyme activity
than the consumption of fats (e.g. corn oil) that are rich in
essential fatty acids (16). Northern analysis indicates that the
increase in hepatic Δ-6 desaturase activity associated with the
consumption of an essential fatty acid-deficient diet is paralleled
by a comparable increase in the hepatic abundance of Δ-6
desaturase mRNA (Fig. 5). Thus, it appears that the activity of
hepatic Δ-6 desaturase is largely regulated by pretranslational
events. However, this may not be the case in all tissues.
Specifically, Δ-6 desaturase activity is reportedly very low in
nonhepatic tissues (16-18). Because of this low enzymatic activity
in nonhepatic tissues, the liver has been considered to be the
primary site of 20:4(n-6), 20:5(n-3), and 22:6(n-3) production for
peripheral tissue utilization (17). However, Northern analysis
of RNA from a number of different human tissues challenges
this concept (Fig. 5C). For example, the level of Δ-6 desaturase
mRNA in the human liver was comparable to that found in the
human lung and heart. Moreover, the abundance of Δ-6 desaturase
mRNA in the adult human brain was severalfold greater
than that in the human liver (Fig. 5C). This high level of
expression is certainly very consistent with the fact that >30%
of the human brain phospholipid consists of 20- and 22-carbon
polyenoic fatty acids (5, 6, 40). However, such high expression
is in conflict with the reports that brain microsomes have a rate
of Δ-6 desaturation that is only 10-15% of that found in the
liver (18, 41). These data suggest that Δ-6 desaturase enzymatic
activity may be determined by tissue-specific mechanisms
that involve both pre- and post-translational events.

In conclusion, Δ-6 desaturase catalyzes the rate-limiting step
in the conversion of 18:2(n-6) and 18:3(n-3) to the long chain
polyenoic fatty acids 20:4(n-6) and 20:5(n-3) and 22:6(n-3),
respectively (15). These long chain polyenoic fatty acids are
essential for a large number of biological functions including
inflammatory responses (4), brain development (2), retina function
and cognition (1, 3), signal transduction (42, 43), reproduction
(4), fetal growth (9), cell differentiation (14), and gene
regulation (11-13). Not surprisingly, physiological conditions
that are associated with low levels of Δ-6 desaturase activity
may have a pronounced impact on a wide array of biological
functions. For example, an impaired conversion of 18:2(n-6) to
18:3(n-6) appears to be associated with reduced nerve conductivity
in human diabetics (7). Similarly, the low rate of
18:3(n-3) conversion to 20:5(n-3) and 22:6(n-3) observed in
newborn infants is highly correlated with impaired retina function
and reduced cognitive development (1). Now that the Δ-6
desaturase has been cloned, we can begin to define the role that
Δ-6 desaturation may play in an apparently wide array of
physiological processes.

ABSTRACT γ-Linolenic acid (GLA; C18:3 Δ6,9,12) is a
component of the seed oils of evening primrose (Oenothera
spp.), borage (Borago officinalis L.), and some other plants. It
is widely used as a dietary supplement and for treatment of
various medical conditions. GLA is synthesized by a Δ6-fatty
acid desaturase using linoleic acid (C18:2 Δ9,12) as a substrate.
To enable the production of GLA in conventional oilseeds, we
have isolated a cDNA encoding the Δ6-fatty acid desaturase
from developing seeds of borage and confirmed its function by
expression in transgenic tobacco plants. Analysis of leaf lipids
from a transformed plant demonstrated the accumulation of
GLA and octadecatetraenoic acid (C18:4 Δ6,9,12,15) to levels of
13.2% and 9.6% of the total fatty acids, respectively. The
borage Δ6-fatty acid desaturase differs from other desaturase
enzymes, characterized from higher plants previously, by the
presence of an N-terminal domain related to cytochrome b5.

Δ6-Desaturated fatty acids are of major importance in animal
cells as they have roles in the maintenance of membrane
structure and function, in the regulation of cholesterol synthesis
and transport, in the prevention of water loss from the
skin, and as precursors of eicosanoids, including prostaglandins
and leucotrienes (1). In animals, members of this class of
fatty acids are synthesized from the essential fatty acid linoleic
acid (C18:2 Δ9,12), the first step being the desaturation to
γ-linolenic acid (GLA; C18:3 Δ6,9,12) catalyzed by a Δ6-desaturase
(1). Decreased activity of this key enzyme, observed
for example in aging, stress, diabetes, eczema, and some
infections, or increased catabolism of GLA resulting from
oxidation or more rapid cell division (e.g., in cancer or
inflammation) may lead to a deficiency of GLA (reviewed in
ref. 2). Clinical trials have shown that dietary supplementation
with GLA may be effective in treating a number of such
conditions (e.g., atopic eczema, mastalgia, diabetic neuropathy,
viral infections, and some types of cancer; ref. 2). Oils
containing GLA are therefore widely used as a general health
supplement and have been registered for pharmaceutical use.

In the plant kingdom, GLA is an uncommon fatty acid (3).
Only a small number of higher plant species synthesize GLA,
and in many of these, the fatty acid is found exclusively in the
seed. GLA is also present in some fungi (e.g., Mucor javanicus)
and cyanobacteria (3). Major commercial sources of GLA (4)
are evening primrose (Oenothera spp.), in which GLA accounts
for about 8-10% of the seed oil, and borage (starflower)
(Borago officinalis L.) seeds that contain some 20-25% GLA.
These plants, however, suffer from poor agronomic performance
and low yield; borage, for example, produces 300-600
kg/ha in the United Kingdom (4) compared with about 3 t/ha
for oilseed rape. There is therefore considerable interest in
both increasing the GLA content of existing crops and the
production of GLA in a conventional oil crop (such as high
linoleate rape).

In the higher plant cell, the synthesis of saturated fatty acids
with chain lengths up to C18 and monounsaturated fatty acids
(generally with a double bond at the Δ9 position) occurs in the
plastid. Further desaturation can then occur either in the
plastid or on the endoplasmic reticulum (ER; ref. 5). The
desaturase enzymes of the plastid require reduced ferredoxin
as an electron donor and are either soluble enzymes acting on
saturated acyl-ACP substrates or membrane-bound enzymes
using unsaturated fatty acids esterified to complex lipids such
as monogalactosyldiacylglycerol. In contrast, the ER-located
Δ12- and Δ15-desaturases use fatty acids located at the sn-2
position of phosphatidylcholine as substrates, and cytochrome
b5 as a cofactor (5, 6). The Δ6-fatty acid desaturase in the
developing cotyledons of borage is similar to the Δ12- and
Δ15-desaturases in its location and substrate specificity (oleate/
linoleate at the sn-2 position of phosphatidylcholine), and is
assumed to use cytochrome b5 as its electron donor (7, 8). In
addition, α-linolenic acid esterified to phosphatidylcholine
may act as a substrate, resulting in the accumulation of
octadecatetraenoic acid (OTA; C18:4 Δ6,9,12,15) in borage
leaves (9).

We describe the isolation of a cDNA clone encoding the
Δ6-fatty acid desaturase from developing seeds of borage,
using a PCR-based strategy. The identity of the cDNA has
been confirmed by functional expression and analysis in transgenic
tobacco plants. The encoded protein differs from other
membrane-bound fatty acid desaturases of plants, such as
those encoded by the FAD genes of Arabidopsis (10, 11), in that
the desaturase domain is preceded at the N terminus by a
sequence that is related to cytochrome b5 (12), the haemoprotein
involved in electron transport to other ER-located fatty
acid desaturases (Δ12, Δ15) from higher plants (8, 13).

MATERIALS AND METHODS 

Nucleic Acid Manipulations. Total RNA was isolated from
developing seeds of borage (B. officinalis) using guanidinium
thiocyanate according to the method described in ref. 14.
Poly(A)+ RNA was purified from total RNA using oligo(dT)
cellulose according to standard methods (15) and was used as
a template for cDNA library construction. Single-stranded
cDNA was synthesized from total RNA using the Reverse
Transcription System (Promega) according to the supplier's
instructions and used as a template for PCR amplification with
degenerate primers. All nucleotide sequences were determined
by the dideoxy chain termination method (15), and
aligned using the GCG 8 program (16).

Abbreviations: GLA, γ-linolenic acid; OTA, octadecatetraenoic acid;
ER, endoplasmic reticulum; FAMe, fatty acid methyl ester; DMOX,
4,4-dimethyloxazoline; MS, mass spectrometry.

Data deposition: The sequences reported in this paper have been

PCR-Based Cloning. Two highly degenerate primers were
synthesized for cDNA screening: forward primer A,
5'-GCGAATTC(A/G)TXGGXCA(T/C)GA(T/C)TG(T/C)GGXCA-3'
(fully degenerate to the conserved amino acid
sequence GHDCGH), and reverse primer B,
5'-GCGAATTCATXT(G/T)XGG(A/G)AAXA(G/A)(A/G)TG(A/G)TG-3'
(fully degenerate to conserved amino acid sequence
HHLFP), where X substitutes nucleotides AGTC. Each primer
contained an EcoRI site (underlined) at the 5' end to facilitate
subsequent manipulations. These primers were used for PCR
amplification with cDNA transcribed from total RNA. Reactions
were run on a Perkin-Elmer Cetus DNA thermal cycler
using a program of 1 min at 94°C, 1 min at 45°C, and 2 min at
72°C for 35 cycles followed by extension for 10 min at 72°C.
PCR amplification products were separated on 1.0-2% agarose
gels. PCR fragments of the expected length (600-700 bp)
were purified using the Wizard DNA purification system
(Promega), ligated into pGEM-T Vector according to the
pGEM-T Vector Cloning Kit (Promega), and transformed into
XL1-Blue Escherichia coli cells. Plasmid DNA was purified and
sequenced using the Promega miniprep system.
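The phrase "highly degenerate" can be made concrete by counting how many distinct oligonucleotides each primer pool contains. The sketch below assumes the notation used in the text, where a parenthesized group such as (A/G) lists alternatives and X stands for any of the four nucleotides; the function name `degeneracy` is illustrative.

```python
import re

# Primer sequences as written in the text, without the 5'- and -3' markers.
PRIMER_A = "GCGAATTC(A/G)TXGGXCA(T/C)GA(T/C)TG(T/C)GGXCA"
PRIMER_B = "GCGAATTCATXT(G/T)XGG(A/G)AAXA(G/A)(A/G)TG(A/G)TG"

# Matches either a parenthesized alternative group or a single base symbol.
TOKEN = re.compile(r"\(([ACGT](?:/[ACGT])+)\)|([ACGTX])")


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    total = 1
    for alternatives, single in TOKEN.findall(primer):
        if alternatives:              # e.g. "A/G" contributes a factor of 2
            total *= len(alternatives.split("/"))
        elif single == "X":           # X = A, G, T, or C
            total *= 4
    return total


print(degeneracy(PRIMER_A))  # 1024
print(degeneracy(PRIMER_B))  # 2048
```

Pools of this size help explain the low annealing temperature (45°C) in the cycling program, since only a small fraction of each pool matches any given template exactly.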

Library Screening. Poly(A)+ mRNA from developing seeds
of borage was used as the template for the synthesis of a cDNA
library; custom synthesis and packaging being carried out by
CLONTECH. The cDNA was inserted into the EcoRI site of
the phage vector λ ZAPII, and the resultant DNA was
packaged into phage particles. The cDNA bank contained
2.0 × 10^[illegible] clones with an average insert size of 2.0 kb. Filter
replicas of this library were hybridized with the labeled DNA
probe pBdes1 and with a tobacco cDNA encoding cytochrome
b5 (17). Radiolabeling of DNA and screening of phage libraries
were conducted using standard techniques (15). The full-length
cDNA clone pBdes6 was isolated and sequenced on
both strands.

Northern Blot Analysis. RNA was separated by electrophoresis
through 1% formaldehyde agarose gel, transferred to
nylon membrane (Hybond N, Amersham), and bound by
exposure to UV light for 2 min. Probes were made from the
cDNA clone pBdes6 by random priming (15). The filters were
hybridized and washed as described in ref. 17 and then exposed
to x-ray film at -80°C using an intensifying screen.

Plant Transformation. To facilitate preparation of plant
expression constructs, flanking SalI and SmaI restriction enzyme
sites were added to the coding region of clone pBdes6 by
PCR amplification. Two oligonucleotides were synthesized
based on the pBdes6 coding sequence: primer C,
5'-GCGTCGACATGGGTGGTGAAATGAAG-3' (annealing to the
initiating methionine, indicated in boldface type), and primer D,
5'-GCCCGGGTTAAGGATGAGTGTGAAG-3' (annealing
up to the complement of the stop codon, indicated in boldface
type). The SalI (primer C) and SmaI (primer D) restriction
sites are underlined. The PCR product was purified and
subcloned into the vector pJD330 (18) to generate the plasmid
p35Bdes6. Digestion of p35Bdes6 with XbaI released a fragment
of ≈2,200 bp containing the ORF of the borage pBdes6
together with regulatory elements consisting of the cauliflower
mosaic virus 35S promoter, an Ω-translational enhancer from
tobacco mosaic virus (19), and the nopaline synthase (nos)
termination sequence. The XbaI fragment was gel purified and
cloned into pBIN19 (20) to obtain the plasmid pNTdes6, which
was transferred into Agrobacterium tumefaciens strain
[illegible] by electroporation. Tobacco (Nicotiana tabacum cv.
[illegible]) was transformed with the plant expression plasmid
according to standard procedures (21). Initial transformants
were selected on 50 μg/ml kanamycin and then transferred to
100 μg/ml kanamycin. Plants were maintained in axenic
culture under controlled conditions.

Fatty Acid Analysis. Lipids were extracted from leaves of
transformed and control tobacco plants by homogenization in
MeOH-CHCl3 using a modification of the method of Bligh and
Dyer (22). The resulting CHCl3 phase was evaporated to
dryness under nitrogen gas, and the samples were transmethylated
with 1 M HCl in methanol at 80°C for 1 h. Fatty acid
methyl esters (FAMes) were extracted in hexane and purified
using a small column packed with Florisil. Analysis of FAMes
was conducted using a Hewlett Packard 5880A Series Gas
Chromatograph equipped with a 25 m × 0.32 mm RSL-500BP
bonded capillary column and a flame ionization detector.
Fatty acids were identified by comparison of retention times
with FAMe standards (Sigma) separated on the same GC.
Quantitation was carried out using peak height area integrals
expressed as a total of all integrals.
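Expressing each peak integral relative to the sum of all integrals yields a percentage composition of the kind reported in the abstract. The following is a minimal sketch of that normalization; the peak areas are hypothetical placeholders, not measured data, and no detector response factors are applied.

```python
def percent_composition(peak_areas):
    """Convert raw peak integrals into percentages of the summed integrals."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


# Hypothetical integrals (arbitrary units) for illustration only.
example_areas = {
    "16:0": 180.0,
    "18:2": 260.0,
    "18:3 (GLA)": 130.0,
    "18:3 (alpha)": 330.0,
    "18:4 (OTA)": 100.0,
}

composition = percent_composition(example_areas)
for name, percent in composition.items():
    print(f"{name}: {percent:.1f}%")
```

The percentages always sum to 100, so a value for one fatty acid is only meaningful relative to the set of peaks included in the total.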

GC-Mass Spectrometry (MS) Analysis. Fatty acid
4,4-dimethyloxazoline (DMOX) derivatives were prepared for
GC-MS analysis by a modification of the method of Fay and
Richli (23). Lipid samples (extracted from tobacco leaves as
described above) were heated at 180°C in
2-amino-2-methyl-1-propanol under N2 for 18 h. After cooling to room
temperature, dichloromethane and water were added. The DMOX
derivatives were recovered in the dichloromethane, passed 
through a column of anhydrous sodium sulfate to remove 
water, and dried under a stream of N2. To remove any 
contaminating polar material the samples were taken up in 
hexane, passed through a short Florisil column, and evapo- 
rated to dryness. The samples were then dissolved in an 
appropriate volume of hexane for GC-MS analysis. Fatty acid 
DMOX derivatives were analyzed by GC-MS on a Hewlett 
Packard 5890 Series II Plus gas chromatograph equipped with 
a 50 m × 0.25 mm BPX70™ capillary column connected
directly to a Hewlett Packard 5989B MS Engine quadrupole
mass spectrometer operating at an ionization energy of 70 eV
and emission current of 300 μA. Mass spectra were interpreted
by comparison to the mass spectra of DMOX derivatives of 
GLA and OTA prepared from blackcurrant oil, which is known 
to contain both these fatty acids (24), using the interpretation 
rules of ref. 25. 



RESULTS 

PCR-Based Cloning of Membrane-Bound Desaturases.
Comparisons of the deduced amino acid sequences of membrane-bound
fatty acid desaturases (and related proteins) from
mammals, fungi, insects, higher plants, and cyanobacteria
reveal three highly conserved regions (boxes) containing histidine
residues (26). Since the borage seed Δ6-desaturase is
membrane-bound (27), two highly degenerate primers were
constructed based on the sequences of the first and third
histidine boxes present in the membrane-bound Δ12- and
Δ15-fatty acid desaturases of plants. These primers were used
in PCRs with cDNA transcribed from total RNA of developing
borage seeds. PCR products of the predicted length
(600-700 bp) were cloned and sequenced, allowing them to be
classified into three groups: 45% showed similarity to other
proteins (i.e., not fatty acid desaturases), 35% resembled
Δ12-desaturases, and 20% formed a separate group that
showed some similarity to both Δ12- and Δ15-desaturases but
was clearly distinct from the second group.

Sequencing of a representative clone (pBdes1) from the
third group revealed an ORF of 228 aa with three putative
histidine boxes. Alignment of the deduced amino acid sequence
with those of known desaturases (data not shown)
showed highest similarity to the Δ15-desaturases, although the
actual level of identity was low (less than 30%). Since borage
seed oil contains little or no α-linolenic acid, it is unlikely that
high levels of transcripts for Δ15-desaturases would be present
in the developing seeds. It was therefore considered likely that
the pBdes1 PCR product encoded part of a putative
Δ6-desaturase.

To isolate a full-length clone corresponding to the pBdes1
PCR product, the insert was used to probe a borage developing
seed cDNA library constructed in λ ZAPII. A total of 3 X 10^
plaques were screened, and 20 individual phage clones that
hybridized with the pBdes1 DNA probe were identified and
purified by further rounds of hybridization. Restriction enzyme
digestion of 15 clones recovered from positive plaques
showed the presence of single inserts that hybridized with the
probe, ranging from 700 to 1,800 bp in length. One of these, 
termed pBdes6, containing an insert of 1,800 bp, was chosen 
for detailed analysis. 

pBdes6 encodes a 1,344-bp ORF, preceded by a 41-bp 5'
untranslated region. The coding region was followed by a stop
codon and a 345-bp untranslated region with a poly(A) tail.
The ORF encoded 448 aa, corresponding to a putative protein
with an Mr of about 50,000, which is significantly larger than
that predicted for other microsomal desaturases such as the
Δ12- and Δ15-desaturases from Arabidopsis (refs. 10 and 11; Fig.
1). A degree of similarity to other fatty acid desaturases is
clear, but only over a part of the coding sequence. The amino
acid sequence from residues 144 to 448 showed about 17%
identity with Δ15 (FAD3) (10) and Δ12 (FAD2) (10) desaturases
from Arabidopsis and about 22% identity with a
Δ6-desaturase from the cyanobacterium Synechocystis (28). The
whole sequence was also 60% identical to a cDNA clone of
unknown function isolated from sunflower seeds (29). The
three conserved histidine boxes that are characteristic of other
membrane-bound desaturases were also present, and located
at similar positions within the sequence. The distance between
the first and second boxes was 32 aa, compared with 31 or 32
aa in Δ12- and Δ15-desaturases, and the distance between the
second and third boxes was 172 aa, compared with 132-173 in
other membrane-bound desaturases. The importance of these
histidine boxes in catalysis has been demonstrated by site-directed
mutagenesis of the soluble Δ9-desaturase from rat and
Δ12-desaturase of Synechocystis (26, 30).
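
As a rough consistency check on the figures reported for pBdes6, the ORF length and approximate protein mass follow from the residue count. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb rather than a value from the paper; the variable names are illustrative.

```python
AVG_RESIDUE_MASS_DA = 110  # rule-of-thumb average; an assumption, not from the paper

residues = 448
orf_bp = residues * 3                       # coding nucleotides, excluding the stop codon
approx_mr = residues * AVG_RESIDUE_MASS_DA  # rough molecular mass in Da

# 5' UTR + ORF + stop codon + 3' UTR, excluding the poly(A) tail
transcript_bp = 41 + orf_bp + 3 + 345
```

This gives 1,344 bp and roughly 49 kDa, in line with the reported ORF length and Mr of about 50,000. The 1,733 bp total is somewhat short of the 1,800-bp insert; the poly(A) tail would presumably account for part of the difference.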

pBdes6 Encodes a Protein Containing a Cytochrome b5-Like
Heme-Binding Domain. The predicted hydrophobicity
plot for the protein encoded by pBdes6 revealed a profile
characteristic of a fatty acid desaturase, with the histidine
boxes located in hydrophilic areas and separated by hydrophobic
domains (not shown). The borage protein, however,
contained a hydrophilic region at the N terminus longer than
those of the membrane-bound Δ12- and Δ15-desaturases. Closer
analysis showed significant sequence similarity between the
first 90-100 aa at the N terminus of the protein encoded by
pBdes6 and microsomal cytochrome b5 proteins from higher
plants (17). This similarity included the presence of seven of
the eight invariant residues of the cytochrome b5 class of
proteins identified by Lederer (12). A heme-containing electron
donor is required for fatty acid desaturation, and cytochrome
b5 is known to fulfill this function with membrane-bound
fatty acid desaturases (Δ12 and Δ15; refs. 8 and 13) and with
the related Δ12-hydroxylase (31). We therefore isolated a
cDNA for cytochrome b5 from the borage cDNA library using
a tobacco cDNA (17) as a probe. Sequencing of this cDNA
revealed an ORF encoding 132 aa which had some 80%
sequence identity to cytochrome b5 proteins from tobacco and
rice (17). It also showed 32% sequence identity with the
cytochrome b5-related domain of the protein encoded by
pBdes6 (Fig. 2). The identity is particularly high in regions
previously identified as essential for cytochrome b5 function,
including the EHPGG motif in the heme-binding region.

Functional Analysis of pBdes6 in Transgenic Tobacco. To 
confirm the identity of pBdes6 as a Δ6-fatty acid desaturase,
the cDNA was transferred to tobacco plants under the control
of an Ω-enhanced cauliflower mosaic virus 35S promoter via
Agrobacterium-mediated gene transfer. Single leaves were
removed from transformed and control plants, and FAMes 
were prepared from total lipid extracts and analyzed by GC 
(Fig. 3). Two peaks were observed in the chromatogram of 

[Fig. 1. Alignment of the deduced amino acid sequences of Atfad3, Atfad2, Bodes6, and Syndes6; the aligned sequences are not legible in the source.]

Fig. 2. Comparison of the deduced amino acid sequence of pBdes6 with plant cytochrome b5 sequences. The first 196 residues of pBdes6
(bodes6) were compared with cytochrome b5 sequences from borage (bocytb5), rice (oscytb5), and tobacco (ntcytb5). The conserved heme-binding
domain is underlined. The sequence of borage cytochrome b5 has been deposited in the GenBank database (accession no. U79011).



FAMes from the transformants (Fig. 3B) not present in the
control plants (Fig. 3A). These peaks had retention times
identical to the FAMe standards of GLA (C18:3 Δ6,9,12) and
OTA (C18:4 Δ6,9,12,15). Further analysis of the transformant by
GC-MS analysis of fatty acid DMOX derivatives confirmed the
identities of these peaks (Fig. 4). Both spectra contained
abundant m/z 113 (McLafferty rearrangement ion) and 126
peaks typical of fatty acid DMOX derivatives (23). The
spectrum of the putative GLA derivative (Fig. 4A) had a
molecular ion at m/z 331, suggesting an octadecatrienoic fatty
acid; gaps of 12 amu between m/z 194 and 206, and m/z 234
and 246, indicating double bonds at C9 and C12; and a
prominent m/z 166/167 pair specific for a C6 double bond
(25). The spectrum of the putative OTA (Fig. 4B) had an
additional gap of 12 amu between m/z 274 and 286, indicating
the presence of a C15 double bond.
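
The gap rule used in this interpretation can be sketched in a few lines: given the diagnostic ions of a DMOX spectrum, look for pairs separated by 12 amu. The function name is hypothetical, and the ion list contains only the m/z values quoted in the text, not a full spectrum.

```python
def find_12_amu_gaps(mz_values):
    """Return (lower, upper) m/z pairs that differ by exactly 12 amu."""
    present = set(mz_values)
    return sorted((mz, mz + 12) for mz in present if mz + 12 in present)


# Diagnostic ions quoted in the text for the putative OTA derivative.
ota_ions = [194, 206, 234, 246, 274, 286]
gaps = find_12_amu_gaps(ota_ions)
```

Each pair returned corresponds to one of the double bonds assigned in the text (C9, C12, and C15); the C6 double bond is assigned separately from the m/z 166/167 pair and is not covered by this sketch.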

The proportions of fatty acids in the total lipid fractions 
prepared from the leaves of the control and transformed 
tobacco plants are given in Table 1. GLA and OTA account for 
about 13% and 10% of the total, respectively, in the transgenic 
material and are absent in the control plants. The presence of 
both GLA and OTA indicates that the Δ6-desaturase used both
linoleic acid and α-linolenic acid as substrates, and this may be
responsible for the decrease in α-linolenic acid observed in the
transgenic line. 
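
The arithmetic behind these proportions can be laid out explicitly. The sketch below uses the Table 1 values for the relevant fatty acids and treats "ND" as zero, which is an assumption made here for illustration; the variable names are likewise illustrative.

```python
# Values (% of total fatty acids) as listed in Table 1; ND is treated as 0.
control = {"GLA": 0.0, "OTA": 0.0, "alpha-linolenic": 65.1, "linoleic": 9.1}
transformant = {"GLA": 13.2, "OTA": 9.6, "alpha-linolenic": 40.1, "linoleic": 9.5}

# Combined share of the two novel Δ6-desaturated products in the transformant.
delta6_products = transformant["GLA"] + transformant["OTA"]

# Drop in α-linolenic acid between control and transformant.
ala_decrease = control["alpha-linolenic"] - transformant["alpha-linolenic"]
```

The decrease in α-linolenic acid (about 25 percentage points) is of the same order as the combined GLA and OTA content, which is consistent with the suggestion that the desaturase draws on this pool.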

Northern Blot Analysis. To provide further evidence that 
the introduction of the borage cDNA into the tobacco genome
was responsible for these novel desaturation products, total
RNA was isolated from the leaves of either a GLA-positive
transgenic tobacco plant or a control plant that had been 
subject to the same tissue culture regime. RNA was also 
isolated from developing borage seeds and leaves, and the 
samples were analyzed by Northern blotting and probed with 
the pBdes6 cDNA (Fig. 5). A positive hybridization signal of 
identical mobility was obtained from RNA isolated from 
borage seeds and the transgenic GLA-positive tobacco line, 
but not from the control tobacco plant. Prolonged exposure of 
the autoradiograph showed that low levels of the pBdes6 
transcript (or related transcripts) were present in the RNA 
samples extracted from borage leaves, a result that is consistent
with the observed accumulation of GLA in the leaves of this
species (9).

DISCUSSION 

We undertook to isolate a cDNA encoding a Δ6-desaturase
from borage using a degenerate PCR approach based on
conserved amino acid sequence motifs in other microsomal
fatty acid desaturases (26, 30). Previous studies (9, 27) had
shown that the borage Δ6-desaturase activity was associated
with the microsomal membrane fraction and probably used
cytochrome b5 as an electron donor, like the microsomal
Δ12 (FAD2) and Δ15 (FAD3) desaturases. The borage cDNA



[Fig. 3 chromatograms, panels A and B; x axis: retention time]

Fig. 3. Identification of GLA and OTA in transgenic tobacco by
GC. Chromatograms of FAMes from leaf tissue of control tobacco
plant (A) or plant transformed with pBdes6 (B). Two novel peaks are
seen in B; these peaks have retention times identical to FAMe
standards of GLA and OTA. The identity of peaks (as determined by
comparison of retention times with those of known standards) is
indicated. Detection was by flame ionization.
Fig. 4. Mass spectra of DMOX-derivatized fatty acids. Spectra of
the fatty acids identified in Fig. 3 as GLA (A) and OTA (B). Details
of the interpretation of the spectra are given in the text. The deduced
structures of the fatty acid derivatives are shown.

clone (pBdes6) was confirmed to encode a Δ6-desaturase by
ectopic expression in the leaves of transgenic tobacco, resulting
in the accumulation of the fatty acids GLA and OTA. The
borage Δ6-desaturase encoded by pBdes6 differed from previously
characterized fatty acid desaturases from higher plants
by the presence of an N-terminal extension related to the
cytochrome b5 class of heme-binding proteins (13). This domain
is not present in the plant microsomal Δ12- and
Δ15-desaturases (10, 11) or in the related Δ12-hydroxylase (31)
which have been cloned and functionally characterized in
transgenic plants, although their use of microsomal cytochrome
b5 as an electron donor has been clearly demonstrated

(32). It is also clear that the N-terminal cytochrome b5-related
domain of pBdes6 is structurally distinct from the
borage microsomal cytochrome b5, as it does not contain the
conserved hydrophobic C-terminal microsomal membrane
anchor normally present in cytochrome b5 proteins (17, 33).
Since cytochrome b5 usually functions in association with the

Table 1. Total fatty acid content of lipid extracts from leaves of a
control tobacco plant and a plant transformed with the borage
Δ6-desaturase clone pBdes6

                              % Fatty acid
Acids                     Control    Transformant
Palmitic      (C16:0)     16.3       14.0
Palmitoleic   (C16:1)     Trace      Trace
              (C16:3)      5.0        9.0
Stearic       (C18:0)      2.4        1.5
Oleic         (C18:1)     Trace       1.3
Linoleic      (C18:2)      9.1        9.5
γ-Linolenic   (C18:3)     ND         13.2
α-Linolenic   (C18:3)     65.1       40.1
OTA           (C18:4)     ND          9.6

Percentages were integrated from peak areas of GC traces shown in
Fig. 3. ND, not detected.



Fig. 5. Northern blot analysis of pBdes6 expression in borage and 
in transgenic tobacco. Total RNA (10 μg), extracted from borage
leaves (Bl), borage seeds (Bs), control tobacco leaves (CTl) or
transgenic tobacco leaves (TTl) was probed with 32P-labeled pBdes6.
After hybridization and high stringency washing, the resulting autoradiograph
indicated expression of the pBdes6 transcript (≈2,000 bp;
marked with the arrowhead) in borage seeds and transgenic tobacco 
leaves. The positions of the rRNA bands are indicated. 

ER membrane, it is likely that the fusion protein described in 
this study has the same location and this is supported by the 
absence of any domains resembling chloroplast targeting tran- 
sit sequences (34). Although the protein encoded by pBdes6 
does not appear to have an N-terminal cleavable ER-targeting 
signal sequence (as judged by computer searching), the hydrophobic
regions present in the protein would be sufficient to
allow it to associate with the endomembrane system. No obvious
ER-retention motifs are present, but a potential glycosylation 
site is present at residues 278-280 (N-V-S). 

Domains related to cytochrome b5 are also present in a
microsomal Δ9-fatty acid desaturase (Ole1p, the OLE1 gene
product) from yeast (35) and in other oxido/reductase enzymes
(e.g., nitrate reductase, sulfite oxidase, and flavocytochrome
b2; ref. 12). In the yeast Δ9-desaturase, this cytochrome
b5 domain exists as a 113-aa C-terminal fusion. Expression of
OLE1 from a multicopy plasmid rescued yeast double mutants
that lacked both OLE1 and microsomal cytochrome b5 genes,
unlike rescue by a rat microsomal Δ9-desaturase, which required
the presence of the cytochrome b5 gene (35). Moreover,
when the C-terminal b5 domain of OLE1 was deleted, the yeast
cells remained fatty acid auxotrophic, even in the presence of
endogenous yeast cytochrome b5, indicating that cytochrome
b5 is not able to act in trans to complement the loss of the
cytochrome b5 fusion domain of Ole1p (35). This suggests that
the fusion domain plays an essential role in the desaturase
reaction of this enzyme. A cDNA clone encoding a related
cytochrome b5 fusion protein has also been isolated from
sunflower seeds (29), as noted above, but the corresponding
protein has not been identified. In the sunflower protein, the
cytochrome b5 domain is fused to the N terminus of a putative
desaturase sequence, as in the pBdes6 protein, and expression
of this domain (≈120 residues) in E. coli (29) has shown that
it is capable of undergoing reversible oxidation and reduction,
indicating a functional heme group. Similar results were also
obtained by expression of a tobacco cytochrome b5 cDNA in
E. coli (33). However, sunflower seeds do not accumulate GLA
(3) and would therefore not be expected to possess an active
Δ6-desaturase. The substrate specificity of this sunflower protein
is not known, and its role in fatty acid desaturation/hydroxylation-type
reactions can only be inferred from sequence
homology. Similarly, the functional and evolutionary
significance of the existence of two types of membrane-bound
desaturases in plants is not clear, although it can be suggested
that the fusion of a cytochrome b5 domain to the desaturase
may facilitate a more efficient electron transfer. It is also
unclear why the yeast Ole1p desaturase has a C-terminal
cytochrome b5 domain, whereas the borage desaturase has an
N-terminal cytochrome b5 domain.

Recently, GLA and OTA accumulation in transgenic plants
has been reported by Reddy and Thomas (36), who expressed
a cyanobacterial Δ6-desaturase gene in tobacco. The combined
levels of GLA and OTA varied from about 2% to 4% of the
leaf C18 fatty acids, with only small differences depending on
whether the protein was targeted to the plastid, cytoplasm, or
ER lumen. This low level of activity is perhaps not surprising,
as the cyanobacterial Δ6-desaturase differs from the ER-located
higher plant desaturases in using ferredoxin rather
than cytochrome b5 as a cofactor. The cyanobacterial
Δ6-desaturase also resulted in the accumulation of comparatively

known. The levels of GLA and OTA accumulating in the leaves
of transgenic tobacco plants expressing the borage desaturase
encoded by pBdes6 account together for over 23% of total
fatty acids, indicating the potential for producing GLA in
transgenic oil crops. Sunflower would be particularly suitable
in this respect, as the presence of between 50% and 70%
linoleic acid, with little or no α-linolenic acid (37), should
facilitate the synthesis of high levels of GLA.
Identification of a Caenorhabditis elegans Δ6-fatty-acid-desaturase by
heterologous expression in Saccharomyces cerevisiae
We identified a cDNA expressed sequence tag from an animal
(the nematode worm Caenorhabditis elegans) that showed weak
similarity to a higher-plant microsomal Δ6-desaturase. A full-length
cDNA clone was isolated and expressed in the yeast
Saccharomyces cerevisiae. This demonstrated that the protein
encoded by the C. elegans cDNA was that of a fatty acid
Δ6-desaturase, as determined by the accumulation of γ-linolenic
acid. The C. elegans Δ6-desaturase contained an N-terminal
cytochrome b5 domain, indicating that it had a similar structure
to that of the higher-plant Δ6-desaturase. The C. elegans
Δ6-desaturase mapped to cosmid W08D2, a region of chromosome
III. This is the first example of a Δ6-desaturase isolated from
an animal and also the first example of an animal desaturase
containing a cytochrome b5 domain.



INTRODUCTION 

Over the last few years, a number of microsomal and soluble
fatty acid desaturases have been isolated from higher plants,
most notably Arabidopsis thaliana (thale cress). This has been
achieved by a combined genetic and biochemical approach to the
generation and complementation of mutant Arabidopsis lines
defective in fatty acid desaturation or elongation [1]. The
importance of this approach has been clearly validated by the
isolation and characterization of genes encoding microsomal
desaturases such as the Δ12 [2] and Δ15 [3] (encoded by the FAD2
and FAD3 genes respectively) enzymes, which had previously
proved intractable to classical purification techniques on account
of their hydrophobicity. The isolation of these and related genes,
such as the Δ12-hydroxylase from Ricinus communis (castor bean)
[4], has allowed the identification of a number of conserved
motifs in plant microsomal desaturases, most notably the
so-called 'histidine boxes' [5]. These short motifs appear to be
required for enzyme function and also allow the proteins
containing these motifs to be classified as di-iron-centre-containing
enzymes [6].

Recently we isolated a cDNA clone from borage (Borago
officinalis), using highly degenerate PCR against these histidine
motifs, which was shown by heterologous expression in transgenic
tobacco (Nicotiana tabacum) to encode a microsomal
Δ6-desaturase [7]. Desaturation at the Δ6 position is an unusual
modification in higher plants, occurring only in a small number
of species such as borage, evening primrose (Oenothera spp.) and
redcurrant (Ribes spp.), which accumulate the Δ6-unsaturated
fatty acids γ-linolenic acid (GLA) and octadecatetraenoic acid
in the seeds and/or leaves. GLA is a high-value plant fatty
acid and is widely used in the treatment of a number of
medical conditions, including eczema and mastalgia. It has been
postulated that the application of GLA replaces the loss of
endogenous Δ6-unsaturated fatty acids [7]. The sequence of the
borage microsomal Δ6-desaturase differed from previously characterized
plant microsomal desaturases/hydroxylases in that it
contained an N-terminal extension which showed sequence
similarity to cytochrome b5 and also in that the third (most
C-terminal) histidine box varied from the consensus [6] H-X-X-H-H,
with a glutamine residue replacing the first histidine.

Although Δ6-fatty-acid desaturation is an unusual modification
in higher plants, it is a common reaction in animals. The essential
fatty acid linoleic acid (C18:2 Δ9,12) is desaturated to GLA by a
Δ6-desaturase as the first step on the biosynthetic pathway of the
eicosanoids, which includes prostaglandins and leukotrienes.
This results in the rapid metabolism of GLA [to dihomo-GLA
(C20:3 Δ8,11,14) and arachidonic acid (C20:4 Δ5,8,11,14)], so accumulation
of this fatty acid is not usually observed. For example, in
the model animal system, the nematode Caenorhabditis elegans,
polyunsaturated fatty acids which have been Δ6-desaturated (in
the form of arachidonic and eicosapentaenoic acids) make up over
20% of the fatty acids of the total lipids, but no GLA is
observed [8]. This is presumably due to its rapid elongation
to C20 fatty acid derivatives.

We wished to determine whether the Δ6-desaturase isolated
from borage was representative of Δ6-desaturases as a whole.
Since most higher plants do not contain this enzyme [7], we
decided to take advantage of the large amount of animal
sequences available on public databases. To this end we identified
a putative C. elegans Δ6-desaturase expressed sequence tag (EST)
and verified its function by expressing the corresponding cDNA
in yeast. When the nematode coding sequence was expressed in
yeast supplemented by the addition of linoleic acid, GLA was
produced. This was confirmed by GC-MS, identifying the coding
sequence similar to the C. elegans predicted open reading frame
(ORF) W08D2.4 as a Δ6-desaturase.



Abbreviations used: EST, expressed sequence tag; GLA, γ-linolenic acid; NCBI, National Center for Biotechnology Information; ORF, open reading frame.
To whom correspondence should be sent.

MATERIALS AND METHODS

The National Center for Biotechnology Information (NCBI) EST
sequence database was searched for polypeptide sequences which
were related to the higher-plant Δ6-fatty-acid desaturase [7] and
contained the variant histidine box Q-X-X-H-H. Putative positive
C. elegans ESTs were further characterized by searching
the C. elegans EST project database (http://www.ddbj-nig.ac.jp/htmls/c-elegans/html/ceMndex.html) in order to identify related
cosmid clones.
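
The motif criterion used in this search (the variant histidine box Q-X-X-H-H, as opposed to the consensus H-X-X-H-H) can be expressed as a simple pattern match. The following is a minimal sketch only: the function name and the toy peptide are hypothetical, and it does not reproduce the database search tools actually used.

```python
import re

# Q-X-X-H-H: glutamine, any two residues, then two histidines.
VARIANT_HIS_BOX = re.compile(r"Q..HH")
# H-X-X-H-H: the consensus form of the third histidine box.
CONSENSUS_HIS_BOX = re.compile(r"H..HH")


def find_motif(pattern, protein_seq):
    """Return 1-based start positions of every match, including overlapping ones."""
    lookahead = re.compile(f"(?={pattern.pattern})")
    return [m.start() + 1 for m in lookahead.finditer(protein_seq.upper())]


# Hypothetical peptide fragment, for illustration only.
toy_seq = "MAAQIEHHLFPQIPHHNLA"
variant_hits = find_motif(VARIANT_HIS_BOX, toy_seq)
consensus_hits = find_motif(CONSENSUS_HIS_BOX, toy_seq)
```

A lookahead is used so that overlapping occurrences are not missed; positions are reported 1-based to match the residue numbering convention used in the text.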

A partial cDNA clone identified by these searches was obtained
from the C. elegans EST project (kindly supplied by Professor Y.
Kohara, National Institute of Genetics, Mishima, Japan), and
this was used to screen a C. elegans cDNA library (mixed stage;
also supplied by Professor Y. Kohara) constructed in λZAPII. A
number of positives were identified and further purified, and full-length
clones were confirmed by sequencing to encode a transcript
likely to have been transcribed from the gene designated
W08D2.4, on cosmid W08D2, as determined by database searching
of the genes sequenced by the C. elegans genome project.

The coding sequence of W08D2.4 was introduced into the
yeast expression vector pYES2 by PCR. Oligonucleotides with
5' overhangs were used to introduce KpnI and SacI sites at
the 5' and 3' ends respectively. The fidelity of the construct was
checked by in vitro transcription and translation using the
TNT system (Promega).

The resulting plasmid was introduced into yeast (Saccharomyces
cerevisiae) by the lithium acetate method [9], and expression
of the transgene was induced by addition of galactose.
The yeast was supplemented by the addition of 0.2 mM linoleate
in the presence of 1% tergitol, following the method of [10].

Yeast total fatty acids were analysed by GC of methyl esters, 
exactly as described previously [7]. Confirmation of the presence 
of GLA was carried out by GC-MS using a Kratos MS80RFA 
instrument operating at an ionization voltage of 70 eV, with a 
scan range of 500-40 Da. The mass spectrum of the novel peak 
resolved by GC was compared with that of an authentic GLA 
standard (Sigma). 



RESULTS 

The sequence of the borage Δ6-desaturase was used to search
databases for related sequences in species which, although they
do not accumulate GLA, might be expected to perform
Δ6-desaturation. The simplest organism which fulfilled this criterion
was the free-living nematode C. elegans. This small animal has
been subject to both random cDNA (EST) sequencing programs
and large-scale genome sequencing. Our searches of EST databases
identified a high-scoring nematode EST, namely yk436b12.
This partial sequence of 448 bases was used to search for related
cosmid clones sequenced by the C. elegans genome project, using
the DNA database of the Japan C. elegans EST project server.
This indicated that the clone yk436b12 showed sequence similarity
to part of a gene present on cosmid W08D2 (GenBank
accession number Z70271), which forms part of chromosome III
[11]. Bases 21-2957 of cosmid W08D2 are predicted by the
protein prediction program Genefinder [11] to encode an ORF of
473 residues which is interrupted by five introns. Examination of
this predicted protein sequence (designated W08D2.4 by the
Sanger Centre Nematode Sequencing Project, Hinxton, Saffron
Walden, Essex, U.K.) revealed that it had a number of characteristics
reminiscent of a microsomal fatty acid desaturase,
including three histidine boxes. However, the predicted protein
sequence indicated the presence of an N-terminal domain similar
to that of cytochrome b5, containing the diagnostic H-P-G-G
motif found in cytochrome b5 proteins [12]. Since the
Δ6-desaturase isolated by us from borage [7] also contained an
N-terminal b5 domain, this indicated that W08D2.4 may encode a
Δ6-desaturase. Closer examination of the sequence revealed the
presence of the variant third histidine box, with an H → Q
substitution (again as observed in the borage Δ6-desaturase).
However, the similarity between W08D2.4 and the borage
Δ6-desaturase is low (51.7%), as is the value of 31.0% for identity.
Since W08D2.4 was encoded by a gene containing many introns,
it was necessary to isolate a full-length cDNA to verify the
sequence predicted by the Genefinder program [11] and also to
allow the expression of the ORF to define the encoded function.
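
The identity value quoted here is a count over an alignment. The alignment program and the exact denominator used are not stated in the text, so the following is only a sketch of one common convention (identical residues divided by the number of columns in which neither sequence has a gap); the function name and the toy fragments are hypothetical. Similarity scores additionally require a substitution matrix and are not covered.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned (non-gap) columns in which both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical pre-aligned fragments, for illustration only.
identity = percent_identity("HPGG-SLL", "HPGGAELL")
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so identity values from different programs are not always directly comparable.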

A cDNA library and EST yk436b12 were generously provided
by Professor Y. Kohara, and a number of positive plaques were
identified by screening with the EST insert. These were further
purified to homogeneity, excised, and the largest inserts (
Figure 1 A comparison of the deduced amino acid sequences of the borage (B. officinalis) Δ6-desaturase [7] and the C. elegans cDNA CeD6.1

Abbreviations: Ceeld6, CeD6.1; [label illegible], borage Δ6-desaturase.
Figure 2 Identification of GLA in transgenic yeast by GC

Methyl esters of total lipids of S. cerevisiae grown under inducing conditions (linoleate [...])
were analysed by GC, using flame-ionization detection. (A) is yeast transformed
with [...] (empty) vector pYES2 and (B) is transformed with pYCeD6.1. The common peaks
were identified as C16:0 (peak 1), C16:1 (peak 2), C18:0 (peak 3), C18:1 (peak 4) and C18:2 (peak
5; [...] exogenously). The additional peak (peak 6 in B), which corresponds to the retention
time [...], is indicated by the arrowhead.



[size illegible] bp) from the resulting rescued phagemids were sequenced.
This confirmed that the cDNAs isolated by us did indeed show
similarity to W08D2.4, with the 5' and 3' ends of the cDNA
equivalent to bases 9 and 3079 of the sequence of cosmid
W08D2. Since the ATG initiating codon predicted by the
Genefinder program to be the start of gene product W08D2.4
was indeed the first methionine residue in the cDNA clone, we
reasoned that we had isolated a bona fide full-length cDNA. One
representative cDNA clone (termed CeD6.1; 1463 bp in length)
was sequenced on both strands (GenBank ID: AF031477); the
deduced amino acid sequence is identical with that predicted for
W08D2.4 over the majority of the protein. However, DNA
sequences encoding residues 38-67 (V-S-I...L-Y-F) predicted for
W08D2.4 are not present in the cDNA clone. This means that
the deduced amino acid sequence of pCeD6.1 is in fact 443
residues long, as opposed to that predicted for W08D2.4,
which is 473 residues in length. The only other difference between
the two amino acid sequences is an M → V substitution at residue
[number illegible], resulting from an A → G base change (base 1211). The
deduced amino acid sequence of CeD6.1 is shown in Figure 1,
compared with the previously characterized borage Δ6-desaturase



[7]. Note the presence in the C. elegans sequence of the H-P-G-G
cytochrome b5 motif in the N-terminus (encoded by bases
96-108) and the H → Q substitution in the third histidine box
(encoded by bases 1157-1172).
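
The length difference between the Genefinder prediction and the cDNA can be checked with simple inclusive counting. This is only an arithmetic sketch; the variable names are illustrative, and the residue range and lengths are those quoted in the text.

```python
predicted_len = 473                  # residues predicted by Genefinder for W08D2.4
missing_start, missing_end = 38, 67  # predicted residues absent from the cDNA (inclusive)

missing = missing_end - missing_start + 1  # inclusive range, so add 1
cdna_len = predicted_len - missing
```

The 30 missing residues account exactly for the difference between the 473 residues predicted and the 443 residues deduced from the cDNA.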

Clone pCeD6.1 was then used as a template for PCR amplification
of the entire predicted coding sequence (443 amino acid
residues in length) and cloned into the yeast expression vector
pYES2 (Invitrogen) to yield pYCeD6. The fidelity of this PCR-generated
sequence was checked by in vitro transcription/
translation of the plasmid, using the T7 RNA polymerase
promoter present in pYES2. Using the Promega TNT T7-coupled
transcription/translation system, translation products were
generated and analysed by SDS/PAGE and autoradiography,
following the supplier's instructions. This revealed (results not
shown) that the plasmid pYCeD6 generated a product of molecular
mass 55 kDa, whereas the control (pYES2) failed to yield
any protein products, indicating that the construct was correct.

Transformation and selection of yeast able to grow on
uracil-deficient medium revealed yeast colonies carrying the
recombinant plasmid pYCeD6 by virtue of the URA3 selectable marker
carried by pYES2. Expression of pYCeD6 was obtained by
inducing the GAL promoter which is present in pYES2. This was
carried out after the cells had been grown up overnight with
raffinose as a carbon source, and the medium supplemented by
the addition of linoleate (C18:2 Δ9,12) in the presence of low
concentrations of detergent. This latter addition was required
since the normal substrate for Δ6-desaturation is a C18:2 fatty acid,
which does not normally occur in S. cerevisiae. The cultures were
then allowed to continue to grow after induction, with aliquots
being removed for analysis by GC. When methyl esters of total
fatty acids isolated from yeast carrying the plasmid pYCeD6,
grown in the presence of galactose and linoleate, were analysed by
GC, an additional peak was observed (Figure 2). This had the
same retention time as an authentic GLA standard, indicating
that the transgenic yeast was capable of desaturating linoleic
acid at the Δ6 position. No such peaks were observed in any of
the control samples (transformation with pYES2). The identity
of this extra peak was confirmed by GC-MS, which positively
identified the compound as GLA (Figure 3). This confirms that
CeD6.1 encodes a C. elegans Δ6-desaturase, and that this cDNA
is likely to be transcribed from the gene predicted to encode ORF
W08D2.4, though the deduced amino acid sequence of CeD6.1 is
30 residues smaller than that of W08D2.4.
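
The length bookkeeping described above (residues 38-67 absent, 443 residues observed) can be checked with a short sketch. The helper names `span_length` and `remove_span` and the toy poly-alanine sequence are assumptions made for illustration only; just the coordinates 38-67 and the length 443 are taken from the text.

```python
def span_length(first: int, last: int) -> int:
    """Number of residues in an inclusive, 1-based range."""
    if last < first:
        raise ValueError("last must not be smaller than first")
    return last - first + 1


def remove_span(sequence: str, first: int, last: int) -> str:
    """Return the sequence with the inclusive, 1-based range removed."""
    return sequence[: first - 1] + sequence[last:]


# Coordinates and observed length quoted in the text.
missing = span_length(38, 67)
predicted_length = 443 + missing

# Toy stand-in for the predicted protein (hypothetical, not the real sequence).
toy_predicted = "A" * predicted_length
toy_observed = remove_span(toy_predicted, 38, 67)

print(missing, predicted_length, len(toy_observed))
```

Under these assumptions the missing stretch is 30 residues long, which agrees with the statement that the deduced sequence is 30 residues shorter than the predicted one.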



DISCUSSION 

Organisms such as C. elegans perform Δ6-desaturation, but
unlike plants such as borage or evening primrose, they do not
accumulate Δ6-unsaturated fatty acids such as GLA. We provide
evidence that a C. elegans cDNA (CeD6.1) encodes a
Δ6-desaturase, and that this sequence is similar to the predicted
ORF W08D2.4, except for a 30-residue insertion present in the
N-terminal region of the latter protein. Whether the deduced
amino acid sequence predicted for CeD6.1 represents a splicing
variant of W08D2.4, or is a result of a misprediction of the
intron/exon junctions by the Genefinder program, is unclear.
However, it is clear that CeD6.1 encodes a Δ6-desaturase. The
ORF encoded by this C. elegans sequence appears to be related
to the higher-plant Δ6-fatty-acid desaturase previously isolated
by us [7], in that they both contain N-terminal domains which
show similarity to cytochrome b5. In contrast, other microsomal
fatty acid desaturases from plants do not contain this domain
and use free cytochrome b5 as an electron donor [1,13,14].
Similarly, the domain is absent from the only fatty acid
desaturases isolated from animals, a desaturase from C. elegans
which recognizes a range of C18 and C20 substrates [15] and a
putative fatty acid desaturase from man (Homo sapiens) [16].
These animal sequences also differ from the borage and C.
elegans Δ6-desaturases in lacking the variant histidine box.

Figure 3 GC-MS analysis of the novel peak identified in yeast carrying pYCeD6.1

The sample was analysed for mass spectra as described previously [7], and the data were used to search a library of profiles. The sample was identified as GLA. A comparison of the mass
spectra of the novel peak (A) and an authentic GLA standard (B) is shown. Visual and computer-based inspection indicates that the two spectra are identical.
The reason why the Δ6-desaturases have a fused cytochrome
b5 domain is not known [17]; the only other examples of desaturases
with this extension are fungal microsomal (OLE1) Δ9-desaturases
[10], in which the domain is fused to the C-terminus rather than
the N-terminus of the protein. However, the borage Δ6-desaturase
differs from all the other characterized plant microsomal
desaturases in carrying out 'front-end' desaturation, which is the
introduction of a double bond between C-3 and C-7 of an
already unsaturated fatty acid [18]. This means the enzyme
desaturates at positions between the carboxy group and
pre-existing double bonds, whereas other plant enzymes desaturate
sequentially towards the methyl group. It will be of interest to
determine whether this feature is shared by other 'front-end'
desaturases of plant and animal origin. It is also clear that
identification of heterologous fatty acid desaturases will be
facilitated by the yeast expression system described in the present
study.

We are very grateful to Professor Yuji Kohara for supplying nematode ESTs and cDNA
libraries. We thank Mervyn Lewis for carrying out the GC-MS analysis. IACR-Long
Ashton Research Station receives grant-aided support from the Biotechnology and
Biological Sciences Research Council (U.K.).

Identification of a novel Δ6-acyl-group desaturase by
targeted gene disruption in Physcomitrella patens



Thomas Girke, Hermann Schmidt, Ulrich Zähringer,
Ralf Reski and Ernst Heinz*

Universität Hamburg, Institut für Allgemeine Botanik,
Ohnhorststr. 18, D-22609 Hamburg, Germany
Institut f. Züchtung Landwirtschaftlicher Kulturpflanzen,
Institutsplatz 1, D-18190 Groß Lüsewitz, Germany
LG Immunchemie, Parkallee 1-40, D-23845 Borstel,
Germany, and
Albert-Ludwigs-Universität Freiburg, Institut für
Biologie II, Schänzlestr. 1, D-79104 Freiburg, Germany

Summary 

The moss Physcomitrella patens contains high levels of
arachidonic acid. For its synthesis from linoleic acid by
desaturation and elongation, novel Δ5- and Δ6-desaturases
are required. To isolate one of these, PCR-based cloning
was used, and resulted in the isolation of a full-length
cDNA coding for a putatively new desaturase. The deduced
amino acid sequence has three domains: an N-terminal
segment of about 100 amino acids, with no similarity to
any sequence in the data banks, followed by a cytochrome
b5-related region and a C-terminal sequence with low
similarity (27% identity) to acyl-lipid desaturases. To
elucidate the function of this protein, we disrupted its
gene by transforming P. patens with the corresponding
linear genomic sequence, into which a positive selection
marker had been inserted. The molecular analysis of five
transformed lines showed that the selection cartridge had
been inserted into the corresponding genomic locus of all
five lines. The gene disruption resulted in a dramatic
alteration of the fatty acid pattern in the knockout plants.
The large increase in linoleic acid and the concomitant
disappearance of γ-linolenic and arachidonic acid in all
knockout lines suggested that the new cDNA coded for a
Δ6-desaturase. This was confirmed by expression of the
cDNA in yeast and analysis of the resultant fatty acids by
GC-MS. Only the transformed yeast cells were able to
introduce a further double bond into the Δ6-position of
unsaturated fatty acids. To our knowledge, this is the first
report of a successful gene disruption in a multicellular
plant resulting in a specific biochemical phenotype.

Introduction

Compared to higher plants, many members of moss, algae
and fern families produce a wider variety of polyunsaturated
fatty acids (PUFA; Dembitsky, 1993; Jamieson and
Reid, 1975; Zhukova and Aizdaicher, 1995), and PUFA such
as arachidonic acid (AA) and eicosapentaenoic acid (EPA)
are produced only by lower plants. The function of these
long-chain PUFA in the membranes of lower plants is
still unclear, whereas in humans, they play a key role in
eicosanoid metabolism (Samuelsson, 1983).

The biosynthesis of AA and EPA generally starts with
linoleic acid (18:2), which is channelled into a widely
branching network of desaturation and elongation steps
(Arao and Yamada, 1994; Cohen et al., 1995; Shiran et al.,
1996). Key enzymes in this network are Δ5- and Δ6-desaturases,
which introduce the new double bond between the
first double bond and the carboxyl terminus of the fatty
acid, known as carboxyl-directed desaturation. This mode
differs from the methyl-directed desaturation, which works
towards the methyl end of the unsaturated fatty acid.
Desaturases of both types belong to the membrane-bound
desaturases, which operate in microsomes or in plastids
(Heinz, 1993). All desaturases, including acyl-ACP
(Ohlrogge et al., 1993), acyl-CoA (Enoch et al., 1976) and
acyl-lipid desaturases, are believed to catalyse an
O2-dependent reaction, in which either cytochrome b5 serves
as electron donor for the microsomal or ferredoxin for the
plastidial desaturases (Kearns et al., 1991; Schmidt and
Heinz, 1990; Smith et al., 1990).

In the last few years, extensive sequence information
from various desaturases in the methyl-directed group has
been accumulated, but only a few from the carboxyl-directed
group (Reddy et al., 1993; Sayanova et al., 1997)
have been cloned so far. A good source to clone new
desaturases is the moss Physcomitrella patens. Lipids of
P. patens contain high proportions of AA (up to 30% of
total fatty acids), indicating strong expression of Δ5- and
Δ6-desaturases (Grimsley et al., 1981). This moss can be
propagated vegetatively in the haploid state (Ashton and
Cove, 1977), which simplifies the phenotypic analysis after
mutation or transformation (Schaefer et al., 1991). Genes
of this organism can be specifically inactivated by gene
targeting, as shown by Schaefer and Zryd (1997), who
demonstrated that integration of homologous DNA into
the genome of P. patens takes place by homologous
recombination with a relative efficiency of more than 90%
among transgenic plants.

In the present communication, we describe the isolation
of a new cDNA and its corresponding genomic sequence
from P. patens, using a PCR-based screening. The encoded
protein shared less than 27% sequence identity with known
desaturases and represents a fusion between a C-terminal
desaturase with a cytochrome b5-related part and an
N-terminal extension. Its function and importance for the
biosynthesis of AA (20:4) was identified by disrupting
the corresponding gene in P. patens. The biochemical
phenotype of the null mutant and its subsequent
complementation by feeding γ-linolenic acid (18:3Δ6,9,12)
demonstrated that the disrupted gene codes for a Δ6-desaturase,
which plays a key role in the synthesis of 20:4.

Results 

PCR-based cloning

For PCR experiments, different sets of degenerate primers,
deduced from the three conserved histidine boxes of
acyl-lipid desaturases, were synthesized (Avelange-Macherel
et al., 1995; Shanklin et al., 1994). The template used was
single-stranded cDNA from P. patens, which was
reverse-transcribed from mRNA of 12-day-old protonema cultures.
Bands of the expected length were cloned and sequenced.
Data bank searches and alignments with these new
sequences indicated similarities to acyl-lipid desaturases
for seven cDNA fragments. Six of them were classified
as putative members of the well-known Δ12- and
Δ15-desaturases based on high identities of over 60%. In
contrast to this, one sequence of 550 bp showed less than
27% identity to known desaturases. Since Physcomitrella
was expected to express Δ5- and Δ6-desaturases, it was
postulated that this sequence might be derived from one
of those desaturases.

Isolation of a full-length cDNA 

To isolate a full-length cDNA clone, the 520 bp PCR
fragment was DIG-labelled, and used to screen a cDNA library
of 12-day-old protonemata. Of 3.0 x 10^ plaques screened,
19 positives were isolated. The restriction analysis of their
inserts showed a similar pattern in all cases. The partial
sequence analysis from six inserts revealed that they were
identical to each other within their overlapping regions
and also to the original 520 bp PCR fragment. The longest
insert, designated PPDES6 cDNA, was sequenced on both
strands. It had a length of 2012 bp excluding its poly(A)
tail. An open reading frame stretched from position 319 to
1894, and several stop codons in the corresponding 5'
untranslated region indicated its full length (Figure 1).
The protein PPDES6 translated from the PPDES6 cDNA
contained 525 amino acid residues with a calculated
molecular weight of 59.3 kDa. This is 7-20 kDa larger than
all acyl-lipid desaturases known from higher plants and
cyanobacteria. Data bank searches indicated similarity to
cytochrome b5 sequences from residues 105-176 and to
desaturases from residue 207 towards the C-terminus.
The desaturase domain showed the highest similarity to
the cytochrome b5-containing fusion protein of Helianthus
annuus (Sperling et al., 1995), a putative fusion protein
from Caenorhabditis elegans encoded by cosmid T13F2
(Z81122) and the Δ6-desaturases of Spirulina platensis
(X87094), Borago officinalis (Sayanova et al., 1997) as well
as Synechocystis sp. PCC 6803 (Reddy et al., 1993). The
identity values of PPDES6 to these proteins were low and
ranged from 21% to 27% for the sequence between the
first and third histidine boxes and from 12% to 23% over
the entire length. The sequence motif QIEHH of the third
histidine box started with a glutamine instead of a histidine,
which has also been found in Δ6-desaturases and the
cytochrome b5 fusion protein of H. annuus, but not in other
membrane-bound desaturases. The hydrophobicity plot
(Kyte and Doolittle, 1982) after residue 200 showed the
typical profile of membrane-bound desaturases (data not
shown). The cytochrome b5-related domain contained the
eight invariant residues typical for the cytochrome b5
superfamily (Lederer, 1994).
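
Identity values such as those quoted above are computed from a pairwise alignment. The following is a minimal sketch of one common convention (identical residues divided by the number of columns in which neither sequence has a gap). The source does not state which denominator was used, so this convention, the function name `percent_identity` and the toy aligned strings are assumptions for illustration, not the authors' procedure.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # gapped columns are excluded from the denominator
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy aligned fragments (hypothetical, not taken from Figure 1).
toy_a = "QIEHHLFP"
toy_b = "QVEHH-FP"
print(round(percent_identity(toy_a, toy_b), 1))
```

Other tools divide by the full alignment length or by the length of the shorter sequence, so values computed this way are only comparable when the same convention is used throughout.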

The N-terminal extension of about 100 residues did not
share significant similarity to any sequence in the data
banks, and computer analysis did not detect any motifs
for protein targeting or modification either for the extension
or for the whole protein.

Structure of the gene 

To knock out the PPDES6 gene, its genomic sequence was
amplified by PCR with specific primers C and D. Primer C
was deduced from the 5' end and D from the middle of
the 3' untranslated region of the PPDES6 cDNA. PCR with
these primers and genomic DNA of P. patens as template
amplified a fragment that was 1578 bp longer than the
distance between the binding sites of the primers on the
cDNA. The genomic PCR fragment, denoted PPDES6, was
cloned and sequenced on both strands (Figure 2). Apart
from six putative introns (i1-i6) it was 100% identical with
the cDNA, confirming its identity as the genomic locus of
the PPDES6 cDNA. The 5' splicing border of five introns
was GT and the 3' border of all six was AG. Only the fourth
intron i4 contained the unusual 5' splicing border GC,
which has been found in genes of several plant species
(Xue and Rask, 1995). The reliability of this intron sequence
was confirmed by sequencing two other PCR-amplified
clones over this region. The intron i4 was located between
the two triplets coding for residues 176 and 177. The detected
similarity to cytochrome b5 sequences ended after residue
176.
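
The splice-border observation above (five GT...AG introns and one GC...AG intron) amounts to checking the first and last two nucleotides of each intron. A minimal sketch follows; the function name `classify_intron` and the toy intron strings are hypothetical and are not the PPDES6 intron sequences, which are not reproduced in this text.

```python
def classify_intron(intron: str) -> str:
    """Classify an intron by its 5' (donor) and 3' (acceptor) dinucleotides."""
    seq = intron.upper()
    if len(seq) < 4:
        raise ValueError("intron too short to have two splice borders")
    donor, acceptor = seq[:2], seq[-2:]
    if acceptor != "AG":
        return "non-canonical"
    if donor == "GT":
        return "GT-AG"
    if donor == "GC":
        return "GC-AG"
    return "non-canonical"


# Toy introns for illustration only.
toy_introns = {
    "i_a": "GTAAGTTTTCAG",
    "i_b": "gcaagttttcag",
    "i_c": "ATAAGTTTTCAC",
}
for name, seq in toy_introns.items():
    print(name, classify_intron(seq))
```

Applied to real intron sequences, a tally of the returned classes would reproduce the five-to-one split reported for i1-i6, assuming the intron boundaries are annotated as in the genomic clone.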

Gene targeting 

For the disruption experiments, the first histidine box of 
the genomic clone was replaced by the npt II gene as a 
positive selection marker. The subsequent double digestion 

© Blackwell Science Ltd, The Plant Journal, (1998), 15, 39-48



[Figure 1: sequence alignment of NTCYTB5, SYDes6, SPDes6, BODES6, HAB5 and PPDES6; marked regions: Extension, Cytochrome b5, Desaturase; histidine boxes 1-3.]

Figure 1. Amino acid sequences of PPDES6 and closely related proteins.

For alignment the CLUSTAL X program was used (gap opening 10, gap extension 0.05). Conserved and invariant residues are grey. The approximate
beginnings of the three domains of PPDES6 are marked by arrows and their putative function. The eight invariant residues characteristic for the cytochrome
b5 superfamily and the three histidine boxes of the desaturase domains are framed. The underlined residues indicate the positions of introns i1-i6 in the
genomic sequence PPDES6. SYDes6, SPDes6 and BODES6 refer to the Δ6-desaturases of Synechocystis (U79010), Spirulina (X87094) and Borago (U79010).
NTCYTB5 and HAB5 refer to the cytochrome b5 of Nicotiana (X71441) and the b5 fusion protein of Helianthus (X87143), respectively.
Southern blotting (see Figure 3) is marked with a line above the block.



with SauI/BstBI yielded a linear fragment with the npt II
gene in its centre and the desaturase arms at both ends
(Figure 2). This linear fragment was used to transform P.
patens protoplasts by the PEG method (Schaefer et al.,
1991). Seven transformation experiments with 3.0 x 10^
protoplasts in each experiment resulted in the isolation of
56 independent and stably transformed lines. Five randomly
selected transgenic lines (K1-K5) were used for
detailed analysis regarding the molecular biology of gene
disruption as well as its consequences for fatty acid
biosynthesis.

Molecular analysis of the transgenic lines 

The specific integration of the transformed DNA into the
PPDES6 gene was analysed by PCR using genomic DNA
from five transformed lines (K1-K5) and the wild type. The
locations of the different primers are presented in Figure 2.
It is important to point out that the 3' end of primer 4 binds
40 bp downstream of the cloned genomic sequence to
exclude PCR signals resulting from contamination by the
DNA used for transformation. Its sequence was derived
from the 3' end of an incomplete cDNA clone, which
showed the same sequence in the overlapping region with
cDNA PPDES6, but contained a longer 3' end.

PCR with the primer pair 1/2 amplified fragments of 2.7
kbp, and with the primer pair 3/4 bands of 1.6 kbp, from
all five transformants, whereas experiments with the wild
type gave negative results. The length of the bands agreed
with a substitution of the first histidine box of the PPDES6
gene by the npt II cassette. Both PCR fragments from
two transformants (K2 and K3) were cloned and partially
sequenced. The sequenced segments were identical with
the corresponding regions of the transformed gene disruption
construct. Most importantly, the fragments from primer
pair 3/4 contained the downstream genomic element of
40 bp, which was absent in the transformed DNA. They
lacked the first histidine box, and the transition regions of
the npt II cassette to the PPDES6 gene, as well as the
regions containing the restriction sites Aat II and Hpa
I, were identical in their sequence with the disruption
construct.

To provide evidence for a deletion of the first histidine
box in the PPDES6 gene of the transgenic lines, the genomic
DNA of the transformed lines and the wild type was
digested with BglII, blotted and hybridized with the
DIG-labelled deletion probe Del. This probe represents the SauI/
BstBI fragment encoding the first histidine box, which had
been deleted from the transformed disruption construct
(Figure 3). Hybridization with the deletion probe Del
showed one strong signal of 4.5 kbp and two very weak
signals of 5.0 and 7.0 kbp with the wild type DNA. The
transformed lines K1-K4 had lost the strong 4.5 kbp signal
but not the two weak signals. Line K5 corresponded to the
wild type situation but contained an additional band of
more than 21 kbp.

Figure 3. Verification of gene disruption by Southern blotting.
Genomic DNA (4 µg) from the wild type (WT) and five transformed lines
K1-K5 (1-5) was digested with BglII and hybridized with the deletion probe
(Del). The location of the probe is described in Figure 2. Molecular weights
in kbp are indicated on the right.

To compare the expression of PPDES6 in the five transgenic
lines with the wild type, we blotted total RNA of
14-day-old protonemata and hybridized it with a DIG-labelled
RNA probe against the 3' end of the PPDES6 cDNA
(Figure 4). The wild type showed a strong signal of 2.0-
2.2 kb, whereas the five transgenic lines had lost this
transcript. Hybridization with an npt II-specific probe (blot
not shown) detected a strong signal of 1.0-1.3 kb in all
transgenic lines but not in the wild type.

Functional analysis of PPDES6 in P. patens 

For the functional identification of the desaturase, we
analysed the total fatty acids of the wild type and the five
knockout lines. The fatty acid analyses presented in Figure 5
are confined to the wild type and to line K2, but the
other four lines tested gave essentially the same results.
Pathways [1] and [2] below show the sequences proposed
for the biosynthesis of AA (20:4) and EPA (20:5) in P. patens,
and they are supported by our results (fatty acids are
indicated as m:nΔa,b,c...; m refers to the number of carbon
atoms, n to the double bonds and Δa,b,c... to the position
of the double bonds; desaturation and elongation steps
are indicated by Δx and EL).

18:3Δ9,12,15 → 18:4Δ6,9,12,15 → 20:4Δ8,11,14,17 → 20:5Δ5,8,11,14,17 [2]

Figure 4. Northern blot analysis of PPDES6 expression.
Total RNA (20 µg) from 14-day-old P. patens protonemata was probed with
an RNA probe transcribed from the last 600 bp of the PPDES6 cDNA. Five
transgenic lines K1-K5 (1-5) and the wild type (WT) were analysed.
Molecular weights in kb are indicated on the right.

Compared with the wild type, all transgenic lines showed
a strong decrease in those unsaturated fatty acids, the
formation of which involves a Δ6-desaturation step
(Figure 5): 18:3Δ6,9,12, 18:4Δ6,9,12,15, 20:3Δ8,11,14, 20:5Δ5,8,11,14,17
and most clearly 20:4Δ5,8,11,14. On the other hand, the
possible substrates for a Δ6-desaturase, 18:2Δ9,12 and
18:3Δ9,12,15, increased. Therefore, it is most likely that the
reactions from 18:2Δ9,12 to 18:3Δ6,9,12 as well as from
18:3Δ9,12,15 to 18:4Δ6,9,12,15 were blocked, both of which
are catalysed by a Δ6-desaturase (compare pathways [1]
and [2]).
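
The m:nΔa,b,c notation and the desaturation and elongation steps discussed above can be made concrete with a small sketch. It models a fatty acid as a carbon count plus Δ positions, assumes that elongation adds two carbons at the carboxyl end (so every Δ position shifts by two), and chains Δ6-desaturation, elongation and Δ5-desaturation. The class and function names are hypothetical, and the starting substrates are those named in the text; this is an illustration of the bookkeeping, not a model of the enzymes.

```python
from typing import List, NamedTuple, Tuple


class FattyAcid(NamedTuple):
    carbons: int
    bonds: Tuple[int, ...]  # Δ positions, counted from the carboxyl carbon

    def __str__(self) -> str:
        if not self.bonds:
            return f"{self.carbons}:0"
        positions = ",".join(str(p) for p in self.bonds)
        return f"{self.carbons}:{len(self.bonds)}Δ{positions}"


def desaturate(fa: FattyAcid, position: int) -> FattyAcid:
    """Introduce a double bond at the given Δ position."""
    if position in fa.bonds:
        raise ValueError(f"double bond already present at Δ{position}")
    if not 2 <= position < fa.carbons:
        raise ValueError("position outside the carbon chain")
    return FattyAcid(fa.carbons, tuple(sorted(fa.bonds + (position,))))


def elongate(fa: FattyAcid) -> FattyAcid:
    """Add two carbons at the carboxyl end; all Δ positions shift by +2."""
    return FattyAcid(fa.carbons + 2, tuple(p + 2 for p in fa.bonds))


def run_pathway(start: FattyAcid) -> List[FattyAcid]:
    """Δ6-desaturation, then elongation (EL), then Δ5-desaturation."""
    step1 = desaturate(start, 6)
    step2 = elongate(step1)
    step3 = desaturate(step2, 5)
    return [start, step1, step2, step3]


linoleic = FattyAcid(18, (9, 12))             # 18:2Δ9,12
alpha_linolenic = FattyAcid(18, (9, 12, 15))  # 18:3Δ9,12,15

for substrate in (linoleic, alpha_linolenic):
    print(" -> ".join(str(fa) for fa in run_pathway(substrate)))
```

In this sketch, removing the first step (the Δ6-desaturation) leaves both chains stuck at their C18 substrates, which mirrors the accumulation of 18:2Δ9,12 and 18:3Δ9,12,15 described for the knockout lines.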

To provide further evidence for the function of the new
Δ6-desaturase, we supplemented the knockout line K2 and
the wild type with 18:3Δ6,9,12 (γ18:3). In K2 the feeding of
this fatty acid resulted in the reappearance of 20:3Δ8,11,14
and 20:4Δ5,8,11,14, whereas almost no change was observed
in the wild type. This experiment indicates that the knockout
line K2 is able to synthesize 20:4 from added 18:3Δ6,9,12
but not from 18:2Δ9,12, which increases in unsupplemented
K2. However, the addition of 18:3Δ6,9,12 did not result in a
complementation of the almost complete disappearance
of 20:5Δ5,8,11,14,17 in K2.

Figure 5. Fatty acid profiles of the P. patens wild type (WT) and the knockout
line K2. The fatty acid methyl esters (FAME) of the total lipids were analysed
by capillary gas-liquid chromatography. The chromatograms WT and K2
show the FAME of protonemata grown for 14 days in liquid medium. The
lower chromatogram shows the FAME profile of K2 cells cultured under
the same conditions but in the presence of 50 µM of γ18:3 (18:3Δ6,9,12).

The addition of 20:2Δ11,14 and 20:3Δ11,14,17 (data not
shown) did not result in an increase of 20:4 and 20:5 in the
wild type or in K2. Another interesting effect of the knockout
was the completely different proportion of C20-fatty acids
in K2 (7%) compared to the wild type (30%).

Functional expression of PPDES6 in Saccharomyces 
cerevisiae 

To exclude the possibility that the loss of a Δ6-desaturase
in the knockout lines is a consequence of a regulatory
difference between the Physcomitrella wild type and
knockout lines, PPDES6 was functionally expressed in
Saccharomyces cerevisiae. Plasmid pYESΔ6 containing the
open reading frame of the PPDES6 cDNA was transformed
into the S. cerevisiae strain INVSc1. One clone transformed
with pYESΔ6 and another with the empty vector pYES2 as
control were grown for four to five generations after
induction with 2% galactose in minimal medium. Since S.
cerevisiae does not contain the dienoic fatty acid substrates
required for a Δ6-desaturase, the expression was performed
with supplementation of 18:2Δ9,12 and 18:3Δ9,12,15,
respectively. In subsequent analyses of total fatty acids,
the following Δ6-desaturated products were detected in
the strain expressing PPDES6: 16:2Δ6,9, 18:2Δ6,9, 18:3Δ6,9,12
and 18:4Δ6,9,12,15 (Table 1). In the control cells, none of these
fatty acids were detected. The production of these fatty
acids with an additional Δ6-double bond confirmed that
cDNA PPDES6 encodes a Δ6-fatty acid desaturase.

Discussion 

Structural properties 

The cDNA and the genomic sequence PPDES6 encoding a
novel Δ6-desaturase from P. patens were cloned using a
PCR-based approach. The deduced protein shared less
than 27% identity with the recently cloned Δ6-desaturase
from B. officinalis and with the Δ6-desaturases from
cyanobacteria (Reddy et al., 1993; Sayanova et al., 1997). This is
a surprisingly low value, as until now all desaturases
of the same regioselectivity and the same subcellular
compartment have been more highly conserved, even
between distantly related organisms. For example, six
other PCR fragments from P. patens, isolated in this screening,
coded for putative Δ12- and Δ15-desaturases and
displayed more than 60% identity to the corresponding
desaturases of higher plants and cyanobacteria.

Table 1. Expression of the Δ6-desaturase in S. cerevisiae. The fatty
acid methyl esters of the total lipids from cells transformed with
pYES2 (WT control) and pYESΔ6 (Δ6-desaturase of P. patens) were
analysed by GLC. The cells were cultured in minimal medium
supplemented with 2% galactose for 24 h at 20°C. The last two
columns show data from cultures supplemented with 18:2Δ9,12
(18:2) and 18:3Δ9,12,15 (α18:3)

% total fatty acids

The presence of the cytochrome b5-related domain
upstream of the desaturase suggests its localization in
microsomes rather than in chloroplasts, because plastidial
desaturases normally use ferredoxin as electron donor
(Heinz, 1993). Besides this, PPDES6 contains a new
N-terminal extension of about 100 amino acids, which is
absent in other presently known desaturases. The function
of this extension is unclear, since it shows no significant
homology to any known protein, and targeting or modification
signals were not detected. Interestingly, the three
histidine boxes and the cytochrome domain of PPDES6
are encoded by separate exons (Figure 2), implying that
they may constitute separate evolutionary units. The fourth
intron containing the unusual 5' splicing border GC is
located directly after the last triplet for the cytochrome b5
domain. This organization could allow a differential splicing
between the 5' border of the first and the 3' border of the
fourth intron, resulting in a deletion of both the cytochrome
b5 domain and the N-terminal extension from the desaturase
domain of the PPDES6 transcript.

Molecular analysis of the transgenic lines

In this study, we have described the highly efficient
knockout of the PPDES6 gene after transforming P. patens
with a linear disruption fragment. PCR experiments proved
the specific integration of the npt II cassette into the
PPDES6 locus in all arbitrarily chosen transgenic lines.
Furthermore, Southern blot experiments confirmed the
deletion of a 200 bp segment encoding the first histidine
box from the genome of four transgenic lines (K1-K4). It
is likely that reciprocal exchange by double cross-over led
to the integration observed in these four lines. Targeting
experiments from Schaefer and Zryd (1997) demonstrated
homologous integration into a locus but not a substitution.
The blots with line K5 reveal an even more complicated
situation. Nevertheless, K5 no longer expresses the
Δ6-desaturase activity. Two additional signals of low
intensity in wild type and in all transgenic lines indicated
that related genomic sequences were not involved in the
gene targeting events. The presence of these sequences
suggests that isoforms of other Δ6-desaturases could be
expressed to some extent in the knockout lines.

In the Northern blots all transgenic lines showed a
dramatically reduced expression of PPDES6 while this
transcript was abundant in the wild type. Thus loss of
desaturase activity, as evident from the fatty acid profiles,
most probably resulted from loss of transcription due to
gene disruption.

Functional analysis of PPDES6 in P. patens and S.
cerevisiae

The gene disruption of PPDES6 resulted in a dramatic
alteration of the fatty acid pattern in the transformed lines.
The knockout lines showed an increase of 18:2 and α18:3
and a decrease of Δ6-desaturated fatty acids. Therefore, it
is likely that PPDES6 codes for a Δ6-desaturase, which
desaturates 18:2Δ9,12 to 18:3Δ6,9,12 and 18:3Δ9,12,15 to
18:4Δ6,9,12,15. The Δ6-regioselectivity of PPDES6 was further
verified by restoration of 20:4 biosynthesis upon feeding
of γ18:3 (Figure 5). The synthesis of 20:4 from γ18:3 would
not work if a Δ5-desaturase or the elongation system had
been blocked. The Δ6-desaturation of 18:2 and α18:3 added
to S. cerevisiae cells expressing PPDES6 confirmed these
results and excluded the possibility that the loss of a
Δ6-desaturase in the knockout lines was due to regulatory
alterations, for example the loss of an activator for the
Δ6-desaturase. On the other hand, we could not detect a
Δ8-C20-desaturase in P. patens, since addition of 20:2Δ11,14
and 20:3Δ11,14,17 did not increase the content of 20:4 and
20:5. A Δ8-desaturase operating at the C20 level could
theoretically replace the Δ6-C18-desaturase in the
biosynthesis of 20:4 and 20:5. Such an enzyme has been suggested
to be present in Euglena gracilis (Nichols and Appleby,
1969).

Based on the knockout effects and feeding experiments,
we propose the two pathways [1] and [2] mentioned above
for the biosynthesis of 20:4 and 20:5 in P. patens, which
branch at 18:2. They are in agreement with the biosynthesis
of 20:4 and 20:5 as suggested for Porphyridium cruentum
(Shiran et al., 1996).

It should be noted that S. cerevisiae cells expressing
PPDES6 produced not only 18:3Δ6,9,12 and 18:4Δ6,9,12,15 but
also 16:2Δ6,9 and 18:2Δ6,9, which were not detected in P.
patens. The reason for their absence in P. patens may
be the low content and rapid turnover of the putative
precursors, 16:1Δ9 and 18:1Δ9, in the moss, whereas they
are produced in high amounts by S. cerevisiae. Since the
Δ6-desaturase converts 16:1Δ9 to 16:2Δ6,9 but does not
introduce a Δ8-double bond into 20:2Δ11,14 and 20:3Δ11,14,17
(mentioned above), the insertion of the Δ6-double bond
involves measuring from the carboxy terminus (and the
Δ9-double bond) rather than from the methyl end. This
classifies the desaturase as a Δ6-desaturase (Heinz, 1993).
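
The position bookkeeping behind this argument can be sketched in a few lines of Python. This is an illustration only, not part of the reported work: the names `FattyAcid`, `desaturate`, `elongate` and `pufa_route` are hypothetical, and the sketch assumes that elongation adds two carbons at the carboxyl end (so every Δ position shifts by two) and that the final step is a Δ5-desaturation, as implied by the feeding experiments.

```python
from typing import List, NamedTuple, Tuple


class FattyAcid(NamedTuple):
    """Chain length plus double-bond positions counted from the carboxyl end."""
    carbons: int
    bonds: Tuple[int, ...]

    def __str__(self) -> str:
        if not self.bonds:
            return f"{self.carbons}:0"
        delta = ",".join(str(p) for p in self.bonds)
        return f"{self.carbons}:{len(self.bonds)}Δ{delta}"


def desaturate(fa: FattyAcid, position: int) -> FattyAcid:
    """Insert a double bond at a Δ position (carboxyl-end numbering)."""
    if position in fa.bonds:
        raise ValueError(f"double bond already present at Δ{position}")
    return FattyAcid(fa.carbons, tuple(sorted(fa.bonds + (position,))))


def elongate(fa: FattyAcid) -> FattyAcid:
    """Add two carbons at the carboxyl end; existing Δ positions shift by 2."""
    return FattyAcid(fa.carbons + 2, tuple(p + 2 for p in fa.bonds))


def pufa_route(start: FattyAcid) -> List[FattyAcid]:
    """Δ6-desaturation, elongation, then Δ5-desaturation (assumed order)."""
    step1 = desaturate(start, 6)
    step2 = elongate(step1)
    step3 = desaturate(step2, 5)
    return [start, step1, step2, step3]


if __name__ == "__main__":
    for substrate in (FattyAcid(18, (9, 12)), FattyAcid(18, (9, 12, 15))):
        print(" -> ".join(str(fa) for fa in pufa_route(substrate)))
```

Starting from 18:2Δ9,12 the sketch ends at a 20:4 species, and starting from 18:3Δ9,12,15 it ends at a 20:5 species, which mirrors the two branches proposed in the text.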

Another interesting effect is the significant decrease in
C20-fatty acids in the knockout lines. The decrease from
more than 30% in the wild type to less than 7% in K2
indicates that the elongation system of P. patens prefers
or even requires Δ6-desaturated C18-fatty acids. This
elongation process is either very rapid or channelled and
thus prevents the accumulation of γ18:3 or 18:4 in lipids.
In the other organisms from which Δ6-desaturases have
been cloned (B. officinalis and Synechocystis), elongation
systems do not co-operate with this desaturase and
therefore Δ6-desaturated fatty acids can accumulate. A detailed
analysis of lipids and fatty acids in P. patens wild type and
knockout plants, as well as in S. cerevisiae expressing the
Δ6-desaturase, will be published elsewhere (T. Girke et al.,
manuscript in preparation).

In our present study, all knockout lines still contained
small amounts of fatty acids which were synthesized by
a pathway requiring Δ6-desaturase. This indicates that at
least one other functional gene for a Δ6-desaturase should
exist. Possible candidates may be the two faint signals
observed above the targeted 4.5 kbp fragment in Southern
blots of wild type and transgenic lines (Figure 3).

Apart from these biochemical changes, we did not detect
any visibly altered phenotype in the knockout plants, at
least in their protonema or gametophore states at 25°C.
Therefore, it was not possible at this point to evaluate
the physiological importance of 20:4 for the moss. The
appearance of a visible phenotype may also be prevented
by residual 20:4. Deletions of several desaturases in
Synechocystis became critical only if the Δ6- and Δ12-desaturase
were knocked out together, whereas a reduction in trienoic
acids without affecting dienoic acids was not critical (Tasaka
et al., 1996).



Experimental procedures 

Plant material and culture conditions 

The protonemata of Physcomitrella patens (Hedw.) BSG were
grown in liquid medium (Reski et al., 1994). For feeding
experiments with fatty acids, 4-day-old cultures were supplemented
with ammonium salts of fatty acids (dissolved in ethanol) to a
final concentration of 50 µM and cultivated for an additional
6-8 days.

Analysis of nucleic acids 

DNA manipulations were performed according to standard
protocols (Sambrook et al., 1989) unless otherwise stated. DNA
sequences were determined on both strands by the dideoxy chain
termination method using Dye Primer as well as Dye Terminator
sequencing kits.



PCR with degenerate primers and cDNA library
screening

Poly(A)+ RNA was isolated with Dynabeads (Dynal, Oslo, Norway)
from total RNA of 12-day-old P. patens protonema cultures, and
reverse-transcribed into single-stranded cDNA. This ss-cDNA was
used as template in the PCR-based cloning. A 550 bp PCR fragment
was amplified with the degenerate sense primer A
5'-TGGTGGAA(A/G)TGGA(C/A)ICA(T/C)AA-3' and antisense primer B
5'-GG(A/G)AA(A/T/G/C)A(A/G)(G/A)TG(G/A)TG(C/T)TC-3', derived from
the amino acid sequences WWKW(N/T)HN and EHHLFP,
respectively. The PCR reactions were carried out with Taq DNA polymerase
using an amplification programme of 3 min denaturation at 94°C,
followed by 30 cycles of 20 sec at 94°C, 30 sec at 45°C, 1 min at 72°C
and terminated by 5 min extension at 72°C. The PCR fragments of
the expected length (500-600 bp) were cloned in pUC18 and
sequenced. A digoxygenin-labelled DNA probe of the PCR
fragment was synthesized by PCR and used to screen a lambda
ZAPII cDNA library of 12-day-old protonemata according to the
manufacturer's protocols (Boehringer, Mannheim, Germany;
Stratagene, La Jolla, CA). The longest insert (PPDES6 cDNA) was
sequenced on both strands using overlapping subclones. The
corresponding genomic sequence PPDES6 was isolated by PCR
with specific primers C (5'-CCGAGTCGCGGATCAGCC-3') and D
(5'-CAGTACATTCGGTCATTCACC-3') using the Expand High
Fidelity PCR System (Boehringer) and the hot start PCR program
described below. PPDES6 was cloned into the pCR-Script Amp
SK(+) cloning vector (Stratagene), resulting in plasmid pPPDES6,
and sequenced on both strands.
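
To make the bracket notation of the degenerate primers concrete, the following Python sketch counts how many distinct oligonucleotides each primer specification stands for. It is an illustration under stated assumptions rather than part of the published protocol: the helper names `parse_primer`, `degeneracy` and `expand` are hypothetical, and inosine (I) is treated as a single fixed position because it is one physical base, not a mixture.

```python
import re
from itertools import product
from typing import List

# Primer specifications as given in the text (5' to 3').
PRIMER_A = "TGGTGGAA(A/G)TGGA(C/A)ICA(T/C)AA"
PRIMER_B = "GG(A/G)AA(A/T/G/C)A(A/G)(G/A)TG(G/A)TG(C/T)TC"


def parse_primer(spec: str) -> List[List[str]]:
    """Split a spec such as 'AA(A/G)T' into per-position base choices."""
    positions = []
    for group, base in re.findall(r"\(([A-Z/]+)\)|([A-Z])", spec):
        positions.append(group.split("/") if group else [base])
    return positions


def degeneracy(spec: str) -> int:
    """Number of distinct sequences encoded by the degenerate spec."""
    count = 1
    for choices in parse_primer(spec):
        count *= len(choices)
    return count


def expand(spec: str) -> List[str]:
    """Enumerate every concrete sequence in the primer mixture."""
    return ["".join(combo) for combo in product(*parse_primer(spec))]


if __name__ == "__main__":
    for name, spec in (("A", PRIMER_A), ("B", PRIMER_B)):
        print(name, len(parse_primer(spec)), "nt,", degeneracy(spec), "variants")
```

Under these assumptions primer A is a 20-mer with 8 variants and primer B is a 17-mer with 128 variants.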



Transformation of P. patens 

First the vector pRT101neo was constructed to obtain an npt II
selection cassette which could be excised by HindIII digestion.
For this purpose the npt II coding region of pRT100neo (Töpfer
et al., 1993) was excised with HindIII (blunted)/XhoI and ligated
between the CaMV 35S promoter and terminator of pRT101 (Töpfer
et al., 1987), which had been digested with XbaI (blunted)/XhoI.
The gene disruption construct resulted from the substitution of a
SauI/BstBI fragment in the genomic clone pPPDES6 by the npt II
selection cartridge. Subsequently, the disruption construct was
digested with AatII and HpaI, resulting in a linear fragment with
the npt II gene in its centre flanked by genomic sequences of
923 bp and 1159 bp. Fifteen micrograms of this linear DNA were
phenol extracted, precipitated and used for the transformation
without separation from the vector. PEG-mediated direct DNA
transfer into protoplasts was performed as described by Schaefer
et al. (1991). The regenerated protonemata were selected for
14 days on medium with G418 (50 mg/l), released for 12 days
under non-selective conditions and again grown for 14 days on
selection plates. Well-growing plants surviving this selection
regime were defined as stable transformants and cultivated for
mass production in non-selective liquid medium. The stability of
their G418 resistance was tested every 4 weeks by incubating
aliquots on selection plates.



experiments, the cultures were grown to an optical density 
(600 nm) of 0.5 in CMdum medium, then supplemented with 2% 
galactose (w/v) as well as 0.003% of the corresponding fatty acid 
(w/v; stock solution solubilized in 5% Tergitol) and finally grown
to saturation for 24 h at 30°C.



DIG-labelling of DNA and RNA

DNA probes were labelled with digoxygenin by PCR (PCR DIG
probe synthesis kit; Boehringer). The 5' ends of the primers for
the deletion probe (Del) were located on the PPDES6 cDNA at
positions 910 and 1092 (SauI/BstBI fragment). The desaturase RNA
probe was transcribed by in vitro transcription with digoxygenin
(Boehringer) from a subclone of the PPDES6 cDNA containing the
last 600 bp of its 3' end, and the npt II probe from a subclone
coding for npt II.



PCR detection, Southern and Northern blot analysis

Four primers were used in the PCR experiments for the detection
of gene targeting events. Primer 1 was derived from the 5' end
of PPDES6. Primers 2 and 3 were constructed from the ends of
the npt II coding region. The sequence of primer 4
(5'-CAGAGACGAATCGTGGCTCC-3') was derived from the 3' end of
an incomplete cDNA clone, which was identical with the PPDES6
cDNA in the overlapping region, but contained a longer 3' end.
The PCR experiments with these primers were run with a hot
start programme of 10 min denaturation at 94°C, addition of the
polymerase at 72°C, followed by 30 cycles of 30 sec at 94°C, 30 sec
at WC, 3 min at 72°C and terminated by 10 min extension at
72°C. Genomic DNA of P. patens was extracted with
cetyltrimethylammonium bromide according to Rogers and Bendich (1988).
Four micrograms of DNA were digested with the appropriate
restriction enzyme, separated on a 0.7% agarose gel by
electrophoresis, transferred onto a nylon membrane and hybridized. The
final washing steps were performed in 0.5 x SSC with 0.1% SDS
at 65°C. The detection was accomplished with a chemiluminescent
substrate (CSPD, Boehringer). The Northern blot experiments were
performed with total RNA isolated from 14-day-old protonema
cultures (RNeasy plant kit, Qiagen, Hilden, Germany). Twenty
micrograms of total RNA were separated on a standard
formaldehyde gel, blotted onto a nylon membrane and hybridized with
RNA probes. The final washing steps were performed in 0.1 x SSC
with 0.1% SDS.



Expression in S. cerevisiae 

The open reading frame of the PPDES6 cDNA was cloned behind
the galactose-inducible promoter GAL1 of the yeast expression
vector pYES2 (Invitrogen, Leek, Netherlands). For this purpose, a
new XhoI site was introduced by PCR (32 bp upstream from its
deduced translational start at position 319). The entire open
reading frame of the desaturase was released with HindIII
(blunted)/XhoI and ligated into the XbaI (blunted)/XhoI sites of the
pYES2 vector to yield plasmid pYESΔ6. Its sequence was verified
by DNA sequencing. The plasmids pYESΔ6 and pYES2 were
transformed into the Saccharomyces cerevisiae strain INVSc1
(Invitrogen) by the lithium acetate method (Ausubel et al., 1995).
Cells harbouring the plasmids pYES2 and pYESΔ6 were grown in
complete minimal drop-out uracil medium (CMdum) containing
2% raffinose as the exclusive carbon source (Ausubel et al., 1995;
Kajiwara et al., 1996) and 1% Tergitol NP-40 (w/v; Sigma) for the
solubilization of fatty acids (Avery et al., 1996). For expression



Lipid analysis 

Lipids were extracted from protonemata and yeast cells by
chloroform-methanol extraction (Siebertz et al., 1979) and purified from
apolar components by TLC in diethylether. In this solvent all
membrane lipids (triacylglycerols were not produced by
protonemata) remained at the start. The fatty acid methyl esters (FAME)
were obtained by transmethylation of the lipids with 1 N H2SO4
in methanol and 2% dimethoxypropane at 80°C for 1 h. The
extracted FAME were analysed by gas-liquid chromatography
using a capillary column (Chrompack, WCOT Fused Silica,
CP-Wax-52 CB, 25 m, 0.32 mm). Their identities were confirmed
by comparison with appropriate FAME standards (Sigma). The
corresponding fatty acid pyrrolidides were obtained as described
elsewhere (Andersson and Holman, 1974) and analysed by
GLC-MS on a HP 5989 A instrument (Hewlett-Packard) equipped with
an HP-5 column using a temperature gradient of 150°C (3 min) ->
320°C at … min^-1. Electron impact (EI) was carried out at 70 eV
and chemical ionization mass spectra (CI-MS) were recorded with
ammonia as reactant gas (0.1 MPa).
linolenic acid in higher plants. Plant Physiol. Biochem. 26:777-792.

The biosynthesis of polyunsaturated fatty acids in higher plants is reviewed
with particular emphasis on linolenate biosynthesis. Much information has
been gained concerning linolenic acid synthesis by following the fate of
radiolabelled precursors in vivo. Linolenate synthesis apparently occurs in both
the chloroplasts on galactolipids and the endoplasmic reticulum on
phospholipids. Linoleate desaturation can be differentially affected by chemical
modulators and environmental conditions such as temperature, light and water
stress relative to fatty acid biosynthesis, resulting in changes in the linolenate
content of lipids. Progress on the biochemical characterization of linoleoyl
desaturase has been hampered by the apparent instability of the enzyme and
the lack of a good in vitro assay system. Progress has been made in the breeding
of plants for altered seed linolenate content (and other fatty acids) and a
number of mutants have been found with altered linolenate levels of seed
lipids and some of leaf lipids. Many of these mutants involve only one or two
genes and therefore should be very useful in the biochemical and molecular
characterization of linolenate biosynthesis in higher plants. The prospects for
the genetic engineering of plants for altered fatty acid composition are
discussed.

Additional key words: Lipids, fatty acids, polyunsaturated, oils, desaturation.
Résumé. The biosynthesis of polyunsaturated fatty acids in higher plants is reviewed, with special emphasis
on linolenate biosynthesis. Much information has been obtained on the synthesis of linolenic acid
by following the fate of labelled precursors in vivo. Linolenate synthesis apparently takes place both
in the chloroplasts, at the level of galactolipids, and in the endoplasmic reticulum, at the level of phospholipids.
Linoleate desaturation can be affected in different ways by chemical agents or environmental conditions
such as temperature, light and water stress, which act on fatty acid biosynthesis
and result in a change in the linolenate content of lipids. Progress in the biochemical characterization
of the linoleic acid desaturase has been hampered by the apparent instability of the enzyme and the absence of a good
in vitro activity assay. Progress has been made in the selection of plants whose seed content of linolenate (and of
other fatty acids) has been modified, and mutants with modified linolenate levels in seed lipids
and even in leaf lipids have been obtained. Most of these mutations involve only one or two genes; they should
therefore be very useful for the biochemical and molecular characterization of linolenate biosynthesis in higher
plants. The prospects opened up for the production of plants with modified fatty acid compositions are
discussed. Additional key words: lipids, fatty acids, polyunsaturated oils, desaturation.

Abbreviations. ACP, acyl carrier protein; Ch, choline; CoA, coenzyme A; DAG, diacylglycerol; CDPCh, cytidine
diphosphoryl choline; EMS, ethylmethane sulfonate; ER, endoplasmic reticulum; G3P, glycerol 3-phosphate; LPA,
lysophosphatidic acid; MGDG, monogalactosyldiacylglycerol; PA, phosphatidic acid; PC, phosphatidylcholine; PLTP,
phospholipid transfer protein; TAG, triacylglycerol. C16:0, C18:3, etc., denote number of carbon atoms and double
bonds. Pairs of numbers denoting fatty acids and separated by a slash (virgule), for example 18:2/18:3, represent the
components at the sn-1 and sn-2 positions in PC, respectively. sn-1 and sn-2 represent the first and second (middle)
positions on the glycerol backbone of lipids.
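
The shorthand defined above is regular enough to be read mechanically. The following Python sketch is offered only as an illustration of the notation; the function names `parse_fatty_acid` and `parse_species` are hypothetical and not taken from the review.

```python
import re
from typing import Dict, Tuple

# Matches "C16:0" or "18:3": an optional leading C, carbon count, double-bond count.
_FATTY_ACID = re.compile(r"^C?(\d+):(\d+)$")


def parse_fatty_acid(label: str) -> Tuple[int, int]:
    """Return (number of carbon atoms, number of double bonds)."""
    match = _FATTY_ACID.match(label.strip())
    if match is None:
        raise ValueError(f"not a fatty acid shorthand: {label!r}")
    return int(match.group(1)), int(match.group(2))


def parse_species(label: str) -> Dict[str, Tuple[int, int]]:
    """Split 'a/b' into the acyl groups at the sn-1 and sn-2 positions."""
    sn1, sn2 = label.split("/")
    return {"sn-1": parse_fatty_acid(sn1), "sn-2": parse_fatty_acid(sn2)}


if __name__ == "__main__":
    print(parse_fatty_acid("C18:3"))
    print(parse_species("18:2/18:3"))
```

For example, `parse_species("18:2/18:3")` places linoleate at sn-1 and linolenate at sn-2, as in the PC example given in the abbreviations.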



Introduction 

Photosynthetic tissues of higher plants typically
contain 60-70% of linolenic acid (18:3), which is the
most abundant fatty acid in nature (Gounaris and
Barber, 1983). The presence of high levels of
polyunsaturated fatty acids in plants has been implicated
as playing a role in maintaining membrane fluidity
of the photosynthetic apparatus and in preventing
chilling damage (Raison, 1980; Oquist and Liljenberg,
1981; Harwood, 1983; Quinn and Williams, 1983;
Kuiper, 1984).

Linolenic acid is a constituent of seed oil fatty
acids in a number of oilseed crops such as soybean,
rapeseed, and linseed. The quality of a seed oil is
primarily dependent upon its fatty acid composition,
which also determines its end use. Soybean oil
contains 8% linolenic acid while other oilseed crops
such as sesame, cottonseed, and sunflower contain
less than 2%. The relatively high 18:3 level in
soybean oil is not desirable for its use as a cooking
oil due to its inverse correlation with oxidative
stability and flavor quality (Smouse, 1979).
Commercial soybean oil is a product of refining and
industrial hydrogenation of the polyunsaturated
fatty acids in the seed oil, which reduces the level
of 18:3 and other unsaturated fatty acids. This
expensive process also generates isomers of
unsaturated fatty acids which are of concern in human
health. Therefore, lowering the 18:3 content in
soybean seed oil has been attempted in several
laboratories (Howell et al., 1972; Tremolieres et al.,
1978, 1982; Hammond and Fehr, 1983; Carver and
Wilson, 1984; Wilcox et al., 1984). However, progress
in the development of commercial cultivars with
lower 18:3 content has been slow.
A major factor contributing to the slow progress
is the poor understanding of the biosynthesis and
regulation of linolenic acid in both leaf and seed
tissues (Stumpf, 1980; Frentzen, 1986). The
formation of linolenic acid is considered to occur via
consecutive desaturations of stearic, oleic, and
linoleic acids with each step being catalyzed by a
different enzyme (Tremolieres and Mazliak, 1974;
Cherif et al., 1975; Slack et al., 1978; Roughan et al.,
1979 a; Stymne and Stobart, 1985). However, little
success has been achieved in the attempt to assay
linoleoyl desaturase activity in vitro, which hampers
further isolation and biochemical study of this
enzyme. Further, it is not yet fully established which
lipids are substrates for desaturation; how many
distinct desaturases exist; whether 18:2 to 18:3
conversion occurs outside as well as inside the
chloroplast; and how the formation of 18:3 is
biochemically and genetically regulated. The present
review discusses the current understanding of certain
aspects of the biosynthesis and regulation of
linolenate in higher plants.

Biosynthesis of polyunsaturated fatty acids 

Desaturation of C18 fatty acids

The processes in the synthesis of plant
polyunsaturated fatty acids from acetate are well understood
up to the step of production of oleic acid (fig. 1).
Enzymes involved in the formation of saturated
fatty acids as far as stearic acid are all soluble,
residing in the stromal phase of plastids (Stumpf,
1980).

The synthesis of oleic, linoleic, and linolenic acids
in higher plants is thought to occur through
consecutive desaturations from stearic acid (Roughan et
al., 1979 a; Jaworski, 1987). The first step of
desaturation from stearate to oleate is catalyzed by
stearoyl-ACP desaturase (Nagai and Bloch, 1968) and
occurs in the stromal phase of chloroplasts. Unlike
all other known desaturases, this desaturase is
soluble instead of membrane bound and was the
first plant desaturase to be studied in great detail
(Gurr, 1974). This enzyme uses stearoyl-ACP
as the substrate and yields oleoyl-ACP as the
product (Stumpf and Porra, 1976; Ohlrogge et al.,
1978). The reaction requires an NADH ferredoxin
reductase, ferredoxin, as well as the desaturase; and
is inhibited by cyanide (Nagai and Bloch, 1968;
Jaworski and Stumpf, 1974). In plants, ferredoxin
acts as an intermediate electron carrier, transporting
two electrons from NADH or water to the double
bond being acted on by the desaturase.

Figure 1. A scheme showing the biosynthetic pathway of
polyunsaturated fatty acids according to the references given in the text. The
enzymes catalyzing individual reactions are: (1), acetyl-CoA
synthetase; (2), acetyl-CoA carboxylase; (3), fatty acid
synthetase; (4), fatty acid elongase; (5), C18:0 desaturase; (6),
lysophosphatidylcholine transferase; (7), C18:1 desaturase; (8) and
(9), C18:2 desaturase(s).

Stearoyl-ACP desaturase has been isolated and
purified from developing safflower seeds. It was
found to be a dimer with a molecular mass of
68 kDa, it required 400 nM oxygen for maximal
activity, and was stimulated several fold by catalase
(McKeon and Stumpf, 1982). In developing seeds as
in leaves, the desaturation of stearic acid to oleic
acid apparently occurs in plastids (Stumpf, 1980).
The elucidation of the biosynthesis of two
polyunsaturated fatty acids, namely linoleate and
linolenate, is based mostly on intact tissue investigation
or experiments with membrane fractions such as
microsomes. Attempts to obtain systems which can
be purified and thereby fully characterized have been
unsuccessful, but desaturation of oleic to linoleic
acid likely occurs outside the chloroplast in both
photosynthetic and non-photosynthetic tissues
(Roughan and Slack, 1982). Studies have
demonstrated that intact chloroplasts only synthesize oleic
acid, although small amounts of linoleic and
linolenic synthesis have been noted (Heinz et al., 1979).
The resulting oleic acid in chloroplasts is transferred
to the cytoplasm and esterified to
phosphatidylcholine (PC), which acts as the substrate for the
desaturation to linoleic acid. The oleoyl-PC
desaturase is membrane bound and localized in the
endoplasmic reticulum (ER) (Abdelkader et al., 1973;
Dubacq et al., 1976; Slack et al., 1976; Tremolieres
et al., 1980 a). It requires NADH and O2 for activity
and it is inhibited by cyanide (Stymne and
Appelquist, 1978). In developing seeds, the activity of this
enzyme is relatively easily measured in crude cell-free
homogenates and microsomal fractions (Stymne and
Appelquist, 1980). The desaturation of oleoyl-PC in
isolated microsomes from young pea leaves was
found to occur predominantly on the sn-2 position
of PC (Murphy et al., 1985), whereas the
desaturation of oleoyl-PC in isolated potato tuber
microsomes was found to occur on both positions, although
again mostly on the sn-2 position (Demandre et al.,
1986). However, the isolation and purification of this
enzyme has not yet been achieved.

Information concerning the desaturation of
linoleate to linolenate is currently very limited. The
microsomal desaturation product, linoleoyl PC, is
transferred back to the chloroplast. A purified PLTP
has been demonstrated as being capable of carrying
phospholipids from microsomes to intact
chloroplasts (Tremolieres et al., 1980 b; Drapier et al., 1982;
Ohnishi and Yamada, 1982; Dubacq et al., 1984;
Grechkin et al., 1984). There has been considerable
discussion as to whether the substrates of linoleoyl
desaturase are phospholipids, fatty acid CoA forms,
or galactolipids (Stumpf, 1980; Roughan and Slack,
1982). In plant leaf tissues, both galactolipids and
phospholipids are thought to serve as the substrates
(Roughan and Slack, 1984; Williams et al., 1983),
while in developing seeds, phosphatidylcholine is
believed to be the substrate (Slack et al., 1979).
Linoleoyl desaturase resembles other known
desaturases in that NADH and O2 are required for
activity and its activity is inhibited by cyanide. This
enzyme is considered to reside in chloroplasts in leaf
tissues. The activity of a linoleoyl desaturase has
been demonstrated to occur in thylakoids of pea
chloroplasts (Grechkin et al., 1984). However,
evidence for more than one site for the formation of
18:3 in plant cells has been accumulating (Frentzen,
1986).

Though plants accumulate large amounts of
linolenic acid, the activity of linoleoyl desaturase is very
low and less stable relative to other desaturases. Few
reports have dealt with the in vitro assay of linoleoyl
desaturase activity. The activity of linoleoyl
desaturase can be assayed by feeding [14C]linoleic acid to
intact plants or homogenates. Catalase stimulates
the activity of this enzyme (Browse and Slack, 1981).
Differential centrifugation of soybean homogenates
caused a complete loss of its activity (Stymne and
Appelquist, 1980), but the enzyme from linseed
cotyledons has been partially stabilized and shown
to be located in microsomes (Browse and Slack,
1981). Attempts to further isolate the enzyme have
not been successful.

Procaryotic and eucaryotic pathways for the formation
of polyunsaturated fatty acids

Two different pathways for the formation of lipids
in higher plants have been proposed (Roughan et
al., 1980; Heinz and Roughan, 1982; Gounaris et al.,
1986; Heemskerk et al., 1987). The characteristics of
the two pathways were originally based on the
positionally specific distributions of the fatty acids
between the two positions of the glycerol backbone
(Heinz, 1977). In the procaryotic (intraplastidic)
pathway, galactolipid synthesis begins with the
assembly of predominantly 18 carbon fatty acids on
glycerol-3-P forming lysophosphatidic acid, followed
by incorporation of 16 carbon fatty acids at the sn-2
position forming 1-18-2-16-phosphatidic acid (PA)
(Sauer and Heise, 1982). The PA is cleaved by a
phosphatase, yielding diacylglycerol which is
subsequently galactosylated to form MGDG (Frentzen
et al., 1982). This group of lipids is exclusively
esterified with C16 fatty acids at the sn-2 position,
while the sn-1 position contains C18- and to a lesser
extent C16-acyl groups. Since this distribution
corresponds to the typical fatty acid pattern of
glycerolipids from cyanobacteria, it is called the procaryotic
pathway. This pathway is located in chloroplasts
and uses galactolipid intermediates as precursors
(Roughan and Slack, 1982). The other system is the
eucaryotic pathway, which forms glycerolipids having
C18-fatty acids at both positions. C16-acyl groups
are excluded from the sn-2 position of lipids formed
by the eucaryotic pathway (Williams et al., 1983).
This eucaryotic pattern is characteristic of
glycerolipids from extraplastidic membranes. Subsequent
evidence indicates that the eucaryotic pathway occurs
in the cytosol phase and involves microsomal PC
as the substrate (Frentzen, 1986).
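
The positional rule described above (C16 at sn-2 marks a procaryotic lipid, C18 at sn-2 a eucaryotic one) can be written as a small lookup. The Python sketch below is only an illustration of that rule; the function name `pathway_of_origin` is hypothetical, and species whose sn-2 acyl group is neither C16 nor C18 are left unassigned rather than guessed.

```python
def pathway_of_origin(species: str) -> str:
    """Assign a 'sn-1/sn-2' molecular species to a pathway by its sn-2 chain length."""
    sn1, sn2 = species.split("/")
    # "C18:3" and "18:3" are both accepted; only the carbon count matters here.
    sn2_carbons = int(sn2.strip().lstrip("C").split(":")[0])
    if sn2_carbons == 16:
        return "procaryotic"
    if sn2_carbons == 18:
        return "eucaryotic"
    return "unassigned"


if __name__ == "__main__":
    for label in ("18:3/16:3", "18:3/18:3", "18:1/16:0"):
        print(label, "->", pathway_of_origin(label))
```

The rule looks only at the sn-2 position because, as stated in the text, the sn-1 position of procaryotic lipids can carry either C18 or C16 acyl groups and is therefore not diagnostic.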

The basis of the two pathway hypothesis is that
fatty acids synthesized de novo in the chloroplast
may either be used directly for production of
chloroplast lipids via the procaryotic pathway (Roughan
et al., 1980; Sparace and Mudd, 1982; Heinz and
Roughan, 1983), or be exported to enter the
eucaryotic pathway at an extrachloroplastic site,
particularly in the endoplasmic reticulum (Block et al.,
1983; Dubacq et al., 1983; Oursel et al., 1987). The
diacylglycerol moiety of PC synthesized by the
eucaryotic pathway is returned to the chloroplast,
probably by the action of a PLTP, where it
contributes to the production of thylakoid lipids (Ohnishi
and Yamada, 1982; Dubacq et al., 1984). In the
eucaryotic pathway, molecular species of PC
produced in the microsomes, composed mainly of 18:2 and
18:3 at both the sn-1 and sn-2 positions, serve as a
precursor of MGDG synthesis in the chloroplast
(Norman and St. John, 1986).

These two different pathways for the synthesis of
plant lipids have been suggested to be associated
with the production of polyunsaturated fatty acids.
This theory is based on the accumulating data from
[14C]acetate, 14CO2, [3H]glycerol and [14C]oleate
labelling of leaves and algae cells in vivo (Appleby
et al., 1971; Williams and Khan, 1982); from labelling
experiments with isolated chloroplasts and
microsomal fractions in vitro (Roughan et al., 1980;
Dubacq et al., 1983); and from enzymological studies
(Joyard and Douce, 1977; Douce and Joyard, 1979;
Block et al., 1983; Frentzen et al., 1983, 1984). In
the procaryotic pathway, 18:1/16:0 monogalactosyl
diacylglycerol (MGDG) is synthesized within
chloroplasts and desaturated in situ to form 18:3/16:3
MGDG (Siebertz et al., 1980). A eucaryotic pathway
involving desaturation of microsomal PC provides
the diacylglycerol (DAG) precursors for 18:3/18:3
MGDG synthesis (Roughan and Slack, 1984). For
example, in Arabidopsis thaliana, 18:2/16:2 MGDG
(procaryotic pathway) is the substrate for
production of 18:3 at the sn-1 position of MGDG. The
desaturation of 18:2 on PC (eucaryotic pathway)
provides 18:2/18:3 PC as a precursor for 18:3/18:3
MGDG synthesis (Norman and St. John, 1986).

According to the positional and fatty acid
specificities of the glycerophosphate and
monoacylglycerolphosphate acyltransferase, phosphatidic acids with
a procaryotic pattern are formed in the chloroplast
envelope (Stobart et al., 1983; Stymne and Stobart,
1984 a and b, 1985). This phosphatidic acid serves as
the substrate for the subsequent biosynthesis of
monogalactosyldiacylglycerol as well as
phosphatidylglycerol. The ability to form procaryotic
glycerolipid is decisively controlled by the activity of the
plastidial phosphatidic acid phosphatase (Gardiner
et al., 1982; Heinz and Roughan, 1983). In vitro
labelling experiments with isolated chloroplasts from
different 16:3 and 18:3 plants indicate that the
phosphatase activity is highly correlated with the
amount of 16:3. In the 16:3-plant spinach, this
phosphatase is located in the outer membrane of
the chloroplast (Joyard and Douce, 1977).

The ER is the primary site of the biosynthesis of
eucaryotic glycerolipids in the extraplastidic part of
the cell (Moore, 1982; Sauer and Robinson, 1985).
The activities of the ER are relevant to the biogenesis
of the cell's entire membrane system. However, the
microsomal system from photosynthetically active
tissues is less well characterized than that from
developing seeds. PC has a decisive metabolic role
not only in the biosynthesis of eucaryotic
diacylglycerolipids in leaves, but also in the biosynthesis
of polyunsaturated TAGs in developing seeds (Slack
et al., 1978; Roughan and Slack, 1984). The
phosphatidic acid formed in the ER membrane serves as
a precursor for the biosynthesis of the different
phospholipids which can subsequently be used for
linoleic and linolenic acid biosynthesis. The
desaturation of acyl groups in the ER is conclusively
demonstrated to occur on those esterified to PC
(Roughan and Slack, 1984). This desaturation occurs
in membrane fractions not only from developing
oilseeds, but also from photosynthetic tissues.
The relative contribution of these two pathways
to MGDG synthesis establishes the fatty acid
composition in a given plant species (Roughan, 1985).
For example, if only the eucaryotic pathway is used,
the plant is classified as an "18:3" species. If the
procaryotic pathway contributes substantially to
cellular lipids, the plant is termed a "16:3" species.
A wide variation in the percentage of 16:3 in MGDG
has been observed among plant species (Jamieson
and Reid, 1971) and this diversity is considered to
reflect the relative activity of enzymes involved in
the two separate biosynthetic pathways (Roughan,
1975).
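
As a rough illustration of how the "16:3" and "18:3" labels follow from pathway contributions, the Python sketch below sums the mol% of MGDG molecular species that carry a C16 acyl group at sn-2 and compares that share with a cutoff. Everything here is an assumption made for illustration: the function names, the 5% default cutoff standing in for "contributes substantially", and the example compositions used to exercise the code are hypothetical and are not taken from the review or from measured data.

```python
from typing import Dict


def procaryotic_share(mgdg: Dict[str, float]) -> float:
    """Fraction of MGDG (by mol%) whose sn-2 acyl group is C16.

    `mgdg` maps 'sn-1/sn-2' species labels such as '18:3/16:3' to mol%.
    """
    total = sum(mgdg.values())
    if total <= 0:
        raise ValueError("composition must contain a positive total")
    procaryotic = sum(
        percent
        for species, percent in mgdg.items()
        if species.split("/")[1].strip().lstrip("C").startswith("16:")
    )
    return procaryotic / total


def plant_type(mgdg: Dict[str, float], cutoff: float = 0.05) -> str:
    """Label a plant '16:3' or '18:3'; the cutoff is an arbitrary placeholder."""
    return "16:3 plant" if procaryotic_share(mgdg) >= cutoff else "18:3 plant"


if __name__ == "__main__":
    # Invented compositions, for demonstration only.
    mixed = {"18:3/16:3": 40.0, "18:3/18:3": 60.0}
    eucaryotic_only = {"18:3/18:3": 95.0, "18:2/18:3": 5.0}
    print(plant_type(mixed), plant_type(eucaryotic_only))
```

The cutoff is exposed as a parameter because the text gives no numerical boundary between the two classes.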

The glycerolipid biosynthesis capacity in the two
pathways must be well balanced to result in the
different levels of procaryotic and eucaryotic
glycerolipids in the membrane system of chloroplasts of
various plants (Douce and Joyard, 1979). In A.
thaliana, almost equal amounts of chloroplast lipids
are synthesized by the procaryotic and eucaryotic
pathways (Browse et al., 1986 b). The quantities of
individual lipids produced by the two routes are
very different. Chloroplast phosphatidylglycerol is
synthesized via the procaryotic pathway, whereas
chloroplast PC is a product of the eucaryotic
pathway. In one A. thaliana mutant (JB1), the
synthesis of 18:3 from the procaryotic pathway is
deficient, but plants compensate by producing more
18:3 from the eucaryotic pathway (Norman and
St. John, 1986).

The production of glycerolipids in the two
biosynthetic pathways is also modulated by the
concentrations of glycerol 3-phosphate (Gardiner et al.,
1982). This regulation probably results from the
differing affinities of the glycerol 3-phosphate
acyltransferase for the acyl acceptor. Results of in vivo
and in vitro labelling experiments showed that an
increased cellular concentration of glycerol
3-phosphate in leaves of 16:3 and 18:3-plants had no effect
on the total incorporation of acetate into lipids, but
significantly stimulated the synthesis of procaryotic
glycerolipids (Gardiner et al., 1982).

Synthesis of polyunsaturated fatty acids in seed
triacylglycerols

The fatty acid composition of TAGs in oilseeds is 
species and often variety specific (Hilditch and 
Williams, 1964; Downey and McGregor, 1975). The 
relative proportions of the constituent fatty acids 
esterified at the three positions of the glycerol 
molecule also differ considerably. In general, the 
unsaturated C18 fatty acids, oleic, linoleic, and
linolenic acids are major constituents of the TAGs 
of edible oilseed crops. 

PC plays an important metabolic role during the
formation of polyunsaturated triacylglycerols in
seeds (Wilson et al., 1980). In developing cotyledons,
labelled fatty acids accumulate rapidly into PC and
diacylglycerols, but initially only at a slow rate into
TAG. During a chase, following pulse-labelling,
radioactivity is lost largely from the oleic acid of
this phospholipid and accumulates in the
polyunsaturated C18 fatty acids, linoleate and linolenate,
of triacylglycerols (Dybing and Craig, 1970; Slack
et al., 1978). In vitro, the microsomal desaturase from
developing cotyledons uses oleoyl and linoleoyl PC
as substrates to form linoleoyl and linolenoyl PC,
respectively, as products (Stymne and Appelquist,
1978; Slack et al., 1979; Browse and Slack, 1981).
Consequently, this phospholipid appears to serve as
a donor of these fatty acids for TAG formation (fig.
2). Other phospholipids, in addition to PC, can also
serve as acyl donors (Wilson et al., 1980). Since
labelled glycerol moieties as well as acyl moieties
were transferred from PC to TAG during a chase,
it is suggested that this phospholipid could provide
both the DAG and the fatty acids from which TAG
is formed (Slack et al., 1978).

The microsomal fraction from oilseeds possesses
all the necessary enzyme activities for de novo
biosynthesis of TAGs, namely glycerol 3-phosphate
acyltransferase, monoacylglycerol 3-phosphate
acyltransferase, phosphatidic acid phosphatase and
diacylglycerol acyltransferase (fig. 2). PC formed in the
microsome is used as a substrate for the subsequent
desaturation of the oleoyl groups esterified at the
sn-1 as well as the sn-2 position of the glycerol
backbone (Slack et al., 1979; Rochester and Bishop,
1984; Stobart and Stymne, 1985). The
lysophosphatidylcholine acyltransferase in the microsomal
fraction exclusively attacks the sn-2 position of PC.
It possesses a high specificity for unsaturated C18
fatty acids and a slight preference for C18:1 (Stobart
et al., 1983; Stymne et al., 1983). Hence this enzyme
preserves the eucaryotic fatty acid pattern but effects
an acyl exchange between acyl-CoA and position 2
of PC by the combined backward and forward
reactions. The polyunsaturated fatty acids made via
PC can be channeled to TAGs by the reverse reaction of
cholinephosphotransferase. These combined forward
and backward reactions alter the acyl-CoA
mixture exported from the plastids, resulting in a
decrease of C18:1 and an increase of C18:2 or C18:3
groups corresponding to the fatty acid sensitivity of
the acyltransferase and the fatty acid composition
at position 2 of PC, respectively. The acyl exchange
coupled to the DAG-PC equilibrium gives rise to a
continued enrichment of the glycerol backbone with
polyunsaturated fatty acids (Griffiths et al., [year illegible];
Stobart and Stymne, 1985; Stymne and Stobart,
1985). By this pathway the whole DAG moiety of
PC is incorporated into TAGs (Stymne and Stobart,
1984).

Figure 2. A scheme showing the biosynthetic pathway of
polyunsaturated triacylglycerols in developing oilseeds according to the
references given in text. C16 is mostly C16:0 whereas C18 can
be C18:0, C18:1, C18:2, or C18:3. The enzymes catalyzing
individual reactions are: (1), G3P acyltransferase; (2),
1-acylglycerol 3-P acyltransferase; (3), PA phosphatase; (4), DAG
acyltransferase; (5), DAG choline phosphotransferase; (6), C18:1
and C18:2 PC desaturases and (7), lyso-PC acyltransferase.
In maturing soybean seeds, the formation of
linolenate is developmentally regulated. The amount
of linolenic acid is highest during the very early
stages of seed formation with the relative amount
decreasing at the later stages of development (Reubel
et al., 1972; Roehm and Privette, 1970; Cherry et
al., 1984). Assays of cell-free extracts have
demonstrated that the homogenates of early stage
cotyledons possess higher and more stable linoleoyl
desaturase activity than those of later stages (Stymne
and Appelquist, 1980).



Manipulation of the synthesis of polyunsaturated 
fatty acids 

Genetic alteration of the synthesis of polyunsaturated 
fatty acids 

Substantial variation occurs among species in the
level of polyunsaturated fatty acids in seed oil. Some
species, such as sunflower (Helianthus annuus) and
safflower (Carthamus tinctorius), contain essentially
no linolenic acid, but have high levels of linoleic acid.
However others, including soybean (Glycine max),
rapeseed/canola (Brassica napus and B. campestris)
and flax (Linum usitatissimum), all contain significant
quantities of linolenic acid (Downey, 1987). The
content and composition of polyunsaturated fatty
acids in lipids is genetically regulated in plants. The
genetic control of polyunsaturated fatty acid
synthesis is best studied by isolation and characterization
of mutants with altered formation of these fatty
acids. Such mutants have been isolated from various
plant species by means of physical and chemical
mutagenesis. Among these are mutants from A.
thaliana (Browse et al., 1986 a), flax (Green and
Marshall, 1984; Green, 1986), soybeans (Wilson et
al., 1981; Hammond and Fehr, 1983; Wilcox et al.,
1984), and Brassica oilseed crops (Rakow, 1973;
Robbelen and Nitsch, 1975) which have been studied
in some detail.

The A. thaliana fatty acid desaturation mutants
isolated and characterized by Browse, Somerville
and coworkers (Browse et al., 1984, 1985, 1986 a;
Somerville et al., 1987) have particular promise in
facilitating the elucidation of the molecular genetic
controls and functional significance of unsaturated
fatty acids in plant leaves. Four mutants (designated
fadA, fadB, fadC and fadD) isolated by direct
analysis of fatty acid composition of leaf tissues from
an ethylmethane sulfonate (EMS)-mutagenized
population of plants (Browse et al., 1986 a) have been
tentatively described in terms of the sites of the
particular enzymatic lesions (Somerville et al., 1987).
The fadA mutants apparently lack the desaturase
which converts 16:0 to trans-3-hexadecenoic acid.
The fadB mutants accumulate high levels of 16:0
but are deficient in 16:1, 16:2 and 16:3 fatty acids.
The wildtype and the mutants fadC and fadD
contain almost identical C16/C18 ratios. However,
the fatty acid composition of fadD shows a
decreased level of both C18:3 and C16:3 with a
corresponding increase of C18:2 and C16:2 in comparison to
the wildtype. The mutant fadC also contains
reduced levels of polyenoic acids, but in this mutant it
is the monoenoic acids which show a corresponding
increase. These results imply that the desaturase
activities affected by the mutations in fadC and fadD
can act on both C16 and C18 acyl groups, and the
site for the insertion of a new double bond is
determined from the methyl end of the chain. In
chloroplasts, the biosynthesis of dienoic and trienoic
acids is therefore catalyzed by n-6 and n-3 desaturase
activities (Frentzen, 1986).

The kinetics of in vivo labelling of lipids with
[14C]acetate and quantitative analysis of the fatty
acid composition of individual lipids suggest that
reduced activity of a glycerolipid n-3 desaturase is
responsible for the altered lipid composition of the
fadD mutant (Browse et al., 1986 a and b). The effects
of the mutation are fully expressed when plants are
grown at temperatures above 26°C, but are relatively
minor below 18°C, suggesting a temperature-sensitive
enzyme. Both chloroplast (16:3 containing) and
extrachloroplast (18:3 in extrachloroplast membranes)
lipids are equally affected by the mutation,
indicating that either the desaturase is located both outside
and inside the chloroplast or C18:3 formed inside
the chloroplast is re-exported to other cellular sites.

Studies on the synthesis of unsaturated MGDG
molecular species of the fadD mutant suggest that
multiple substrates are involved in the desaturation
of linoleic acid to linolenic acid for the production
of unsaturated galactolipids (Norman and St. John,
1986). The mutation selectively reduces the levels of
18:3/16:3 and increases the amount of 18:3/18:3
despite the overall reduction in 18:3, suggesting that
a chloroplastic pathway for desaturation at the sn-1
position of MGDG utilizes 18:2/16:2 MGDG as the
substrate. This procaryotic pathway is apparently
deficient in this mutant. The eucaryotic pathway
desaturating 18:2 to 18:3 at the sn-2 position of PC
predominates in the mutant. Genetic
characterization of the fadD mutation showed that the low
trienoic fatty acid content is controlled by a single
recessive nuclear gene. There is no change in the
fatty acid composition of seed and root lipids in this
mutant (Browse et al., 1986), indicating that
different pathways or isozymes may operate in the
different tissues.

The nutritional and industrial value of seed
storage lipids is dependent primarily upon the fatty
acid composition. Of particular importance is the
relative proportion of the C18 unsaturated fatty
acids, namely oleic, linoleic and linolenic acids (Smouse,
1979). Oils with high levels of polyunsaturated fatty
acids, particularly linolenic acid, are less suitable for
use as cooking oils due to the poor oxidative stability
of these fatty acids. Efforts have therefore been
expended on altering the fatty acid composition of
seed storage lipids to meet the desired end use of
the oils [it should be noted, however, that 18:3 (an
omega-3 fatty acid) has recently been implicated as
playing an important role in human health (Booyens
and van der Merwe, 1985)].

Searching for and using genetic variants of fatty
acid composition have become common approaches
to manipulating the fatty acid composition
of seed oil crops. For example, flax oil contains
a high percentage of 18:3 fatty acid (45%-65%). This
high level precludes its use as an edible oil and
accounts for its traditional industrial use. Genotypes
with 2% 18:3 have been isolated (Green, 1986). This
alteration was achieved by selection within the F2
generation of a cross between two induced mutants
with reduced levels of linolenic acid (28%-30%) (Green
and Marshall, 1984). This near elimination of linolenic
acid from the seed lipids is accompanied by a
comparable increase in the content of linoleate, with
the proportions of other fatty acids remaining
unaltered. These results indicate that the mutations
block the final desaturation of linoleic to linolenic
acid. Genetic analysis of crosses among these mutants
and their parental cultivar revealed that these
mutations are in different unlinked genes and exhibit
additive (codominant) gene action. Two genetic loci
with additive effects have therefore been identified
as controlling the linolenate content (Green, 1986).

Soybean seed oil is the most common edible oil
in the world (Smith, 1981). About eight percent is
composed of 18:3, and this relatively high content of
linolenic acid as well as linoleic acid has been
considered to be an important factor lowering the
autooxidative and flavor stability of soybean oil.
Accessions of the commercial soybean species
(Glycine max) display 18:3 content of 4-15% of the seed
oil. In other species of the genus Glycine, the
linolenate content ranged from 11.3-27.2% (Smith,
1981; Chaven et al., 1982). The lack of genotypes
with very low 18:3 content within the genus Glycine
has spurred other approaches to reduce the 18:3
content in soybean seed oil.

Thus far, the most effective approach to develop
soybean strains with genetically determined low
levels of linolenic acid has been the use of chemical
mutagens. Treatment with EMS significantly
increased the variation in the fatty acid composition in
the soybean cultivar "Century". A genetically stable
mutant designated C1640 with 3.4% 18:3 was
identified (Wilcox et al., 1984). This mutation is
controlled by a single gene locus (Wilcox and Cavins, 1985)
designated Fan (Wilcox and Cavins, 1987). EMS
treatment of a low linolenic acid breeding line
obtained by recurrent selection resulted in a line,
designated A5, with linolenic acid content of
2.9-4.1% (Hammond and Fehr, 1983, 1984). Graef et al.
(1988) demonstrated that fatty acid composition
should be considered a quantitative character in
crosses which involve A5 as a parent. The decreased
18:3 content in A5 is apparently the result of a
decreased rate of 18:1 desaturation of oleoyl-PC in
this genotype. Interestingly, we find that both C1640
and A5 have reduced 18:3 in root lipids but not in
any other vegetative tissues (Wang et al., 1988). A5
had a corresponding increase in 18:2 in the root
lipids in contrast to the increase in 18:1 in the seed
lipids (Wang et al., 1988).

Recurrent selection for high oleic acid and a high
ratio of 18:1 to 18:2 + 18:3 has been conducted to
alter the fatty acid composition of soybean oil
(Wilson et al., 1981). The results of this study showed
that the 18:1 content of the oil increased linearly
from 24.8 to 33.0%. The inversely correlated trait,
18:3, was reduced linearly from 7.8 to 6.3%.
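
The selection index mentioned above, the ratio of 18:1 to 18:2 + 18:3, is simple arithmetic on the percentage composition of the oil. As a minimal sketch only (the function name and the example percentages are hypothetical illustrations, not data from the cited study, which does not report 18:2 here), it could be computed as:

```python
def oleic_to_polyunsaturated_ratio(pct_18_1, pct_18_2, pct_18_3):
    """Return 18:1 / (18:2 + 18:3) from fatty acid percentages of the oil."""
    polyunsaturated = pct_18_2 + pct_18_3
    if polyunsaturated <= 0:
        raise ValueError("18:2 + 18:3 must be positive")
    return pct_18_1 / polyunsaturated


# Hypothetical composition, for illustration only (not from Wilson et al., 1981).
ratio = oleic_to_polyunsaturated_ratio(pct_18_1=30.0, pct_18_2=50.0, pct_18_3=10.0)
print(round(ratio, 2))
```

A higher value indicates an oil richer in oleate relative to its polyunsaturated C18 fatty acids, which is the direction the recurrent selection aimed for.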

An objective of rapeseed (8-10% 18:3) breeding
has been to reduce the level of linolenic acid to less
than 3% while maintaining or increasing the level
of the nutritionally desirable linoleic acid, presently
at 20-25%. [It should be noted, however, that the
major objective of the genetic improvement of
rapeseed oil has been to reduce the erucic acid
content (22:1). Breeding efforts toward these goals
have been highly successful, resulting in the
development, for instance, of the new Canola type
rapeseed oil with a reduction of erucic acid from
about 50% to less than 1% and a corresponding
increase in oleic acid (18:1) (Downey, 1987). The
canola oils are acceptable for many edible purposes
even though they have an 18:3 content averaging
10% because of the high 18:1/18:2 + 18:3 ratio.]
Chemical mutagenesis was successful in reducing
the linolenic acid level to 5.5% (Rakow, 1973) and
subsequent selection produced material with a
linolenate level as low as 3.2% (Robbelen and Nitsch,
1975). In the Brassica species, linolenic acid
biosynthesis was observed to occur only in those seeds
which possess green photosynthetically active
chloroplasts during certain stages of their development
(Thies, 1970).

Studies of the metabolism of MGDG molecular
species in A. thaliana leaf mutants have revealed
different sites and substrates for linolenic acid
synthesis (Norman and St. John, 1986). The use of other
mutants with low seed 18:3 for elucidating the
mechanism of regulation and synthesis of linolenate
is, however, still at a preliminary stage. Despite great
efforts having been expended on the selection of low
18:3 mutants in soybeans, no 18:3-null mutants have
been found. It is not yet clear if a certain amount
of 18:3 is essential for soybean seed development.
However, the near absence of 18:3 in other oilseed
crops makes this possibility less likely. Inconsistent
expression of the low 18:3 trait in various tissues has been
observed in soybean seed mutants (Martin and
Rinne, 1985). The reduction of linolenate content in
some mutants is due to the blockage of desaturation
at 18:1, whereas in others it occurs at the 18:2 desaturation
step (Cherry et al., 1984). Nevertheless, these mutants
have provided an instrumental approach to the
study of 18:3 biosynthesis and regulation.

Chemical modulation of the formation of polyunsaturated
fatty acids

One substituted pyridazinone, San 9785 (4-chloro-
5-dimethylamino-2-phenyl-3(2H) pyridazinone), is a
potent inhibitor of the desaturation of linoleate to
linolenate (St. John, 1976; Lem and Williams, 1981).
San 9785 has been shown to selectively affect the
levels of linolenate in several species of higher plants
without causing any gross change in leaf
development and chloroplast content (Laskay et al., 1983).
Labelling studies with fatty acid precursors, such as
14CO2 and [14C]acetate, have demonstrated a
reduction in linolenate radiolabelling in the presence
of San 9785 and a concomitant increase in the levels
of [14C]linoleate (Willemot et al., 1982). Therefore,
San 9785 has been considered to have a direct effect
upon the conversion of linoleate to linolenate. This
compound has been widely used in the manipulation
and study of the synthesis and function of linolenic
acid.

It is suggested that San 9785 inhibits 18:3
formation in the procaryotic pathway with little effect on
the eucaryotic one (Lem and Williams, 1981; Norman
and St. John, 1987). San 9785 was shown to reduce
linoleate desaturation of MGDG but not PC. The
differential effects of San 9785 on the pathway of
MGDG synthesis were studied in A. thaliana (Norman
and St. John, 1987). 18:3/16:3 MGDG was decreased
by San 9785, and 18:2/16:3 and 18:2/16:2 MGDG
concurrently increased. Kinetic studies using
exogenously incorporated [14C]18:1 indicated that



Plant Physiol. Biochem. 



Biosynthesis of linolenate 785 



18:3/18:3 MGDG originated from an 18:2/18:3
diacylglycerol precursor derived from PC. The
formation of 18:3 at the sn-2 position of PC was less
sensitive to San 9785 than desaturation of 18:2 at
the sn-1 position of 18:2/18:3 MGDG, which is
proposed to occur within the chloroplasts.

Direct evidence has shown that the site of action
of San 9785 on fatty acid biosynthesis in higher
plants is at the level of linoleic acid desaturation,
but there are large variations in sensitivity between
plant species (Hilton et al., 1971; Murphy et al., 1980,
1985). Effects of San 9785 have also been reported
upon photosynthetic oxygen evolution (Khan et al.,
1979 a; Lem and Williams, 1981), thylakoid
ultrastructure and chlorophyll-proteins (Khan et al.,
1979 b; Davies and Harwood, 1983; Leech and
Walton, 1983). It has been reported that, whereas
San 9785 inhibits the incorporation of [14C]acetate
into linolenate in spinach leaf discs, it had no effect
upon this incorporation in either isolated
chloroplasts or whole leaves of spinach (Willemot et al.,
1982). The effects of San 9785 upon photosynthetic
oxygen evolution that were reported from both Vicia
faba leaf discs and isolated spinach chloroplasts
were not found in the case of whole leaves of barley.
It therefore appears that San 9785 may be quite
variable in its effects upon plants depending on the
species studied and the pretreatment of the tissues.

Uptake studies demonstrated that the uptake of
San 9785 was a reflection of water uptake (Murphy
et al., 1985). Following its uptake, San 9785 was
rapidly converted into other compounds in pea, but
only gradually metabolized in cucumber and
ryegrass. The differential sensitivity of higher plants to
San 9785 was shown to be due to variation both in
uptake and in metabolism.

San 9785 reduces the 18:3 content in soybean
cotyledons developing in vitro (St. John et al., 1984;
Wang et al., 1987 a). It also decreases the activity of
lipoxygenase, an enzyme catalyzing the oxidation of
polyunsaturated fatty acids, in peanut and soybean
seeds (Ory et al., 1981, 1984; St. John et al., 1984;
Wang and Hildebrand, 1987). Treatment with
San 9785 causes these changes without affecting the
yield and other important agronomic parameters.
Thus, it is suggested that this compound could be
applied to the improvement of soybean quality.

The mode of action of San 9785 on the inhibition
of linoleate desaturation is not currently understood.
The observation that San 9785 had little effect on
linolenate synthesis in isolated chloroplasts (Willemot
et al., 1982) suggests that either protein synthesis is
required for San 9785 action or it needs to be
metabolized first in the cytosol in order to be
functional. In addition, the decrease of radioactive
labelling of linolenate in treated tissues is often seen
only after prolonged incubation with San 9785 (Lem and
Williams, 1981; Davies and Harwood, 1983). This
delay may be due to the slow uptake of San 9785;
to a delay in the conversion to active metabolites
(St. John and Hilton, 1976); to an inhibition of the
synthesis of linoleate desaturase; and/or to an
elevation of degradation of the desaturase by this
compound. Furthermore, the inhibition of linoleate
desaturation by San 9785 does not occur in the
presence of cycloheximide (Norman, personal
communication), a cytosolic protein synthesis inhibitor,
indicating that protein synthesis is required for
San 9785 function.

Environmental effects on the synthesis of polyunsaturated
fatty acids

The synthesis of linolenic acid in plants can be
affected by a number of environmental factors which
include temperature, light, water stress and salt
stress (Harwood, 1984; Tremolieres, 1985). Low
temperature often stimulates the synthesis of
polyunsaturated fatty acids in various plant tissues
(Hazel and Prosser, 1974). Plants grown at low
temperatures during seed maturation accumulate
more 18:3 in TAGs than those grown at high
temperatures (Reubel et al., 1972; Hawkins et al.,
1983 a and b; Cherry et al., 1985). A change in the
ambient temperature caused a marked alteration
over a 24 h period in the proportions of unsaturated
C18 fatty acids in PC and DAG during soybean
and linseed cotyledon development (Slack and
Roughan, 1978). At high temperatures, 18:1
increased and 18:2 and 18:3 decreased. For soybean
cultivars with different levels of linolenate grown in
Northern areas (low temperature) and Southern
areas (high temperature), seeds produced in the
North are significantly higher in myristate and
linolenate, but are lower in oleate (Cherry et al.,
1985). Tremolieres et al. (1978, 1982) found that in
rapeseed low temperatures increased the level of
polyunsaturated fatty acids at the expense of oleic
acid biosynthesis without change of the total lipid
content. Similar effects of temperature on
polyunsaturated fatty acid synthesis have also been
observed in plant cells in culture (Tremolieres et al., 1978;
MacCarthy and Stumpf, 1980; Tremolieres et al.,
1982). However, in developing sunflower seeds
(Tremolieres et al., 1982) low temperatures decreased
lipid accumulation with little change in fatty acid
composition. Linolenic acid biosynthesis in Pharbitis
nil cotyledons was likewise very slow at lower
temperatures (17°C) compared to higher temperatures
(27°C).

There is some controversy concerning the reason
for the elevated lipid desaturation in plants grown
at low temperatures. Three schools of thought exist.
Some believe that low growth temperature results
in an alteration of the activity or amounts of the
desaturase enzymes themselves and that such changes
are adaptive, enabling cellular membranes to
function more effectively (Thompson, 1980). Others
propose that the low growth temperature increases
oxygen solubility, therefore providing more
substrate for the existing desaturase enzymes (Rebeille
et al., 1980). Still others (Browse and Slack, 1983)
found evidence which suggests that the apparent
increased lipid desaturation at lower temperature in
developing safflower cotyledons is actually the
consequence of greater increases in fatty acid synthesis
than in oleate desaturation at higher temperatures,
which therefore decreases the ratio of
polyunsaturated to monounsaturated fatty acids at higher
temperatures.

Light also has a profound impact on linolenic
acid synthesis. Linolenic acid, esterified to
phospholipids or galactolipids, is the principal component
of the lipid matrix of photosynthetic membranes of
chloroplasts (James and Nichols, 1966), and it can
account for up to 90% of the total fatty acids in
that organelle (Leech and Murphy, 1976; Tremolieres,
1985). The easiest and most widely used
experimental system is the greening of etiolated tissues.
Etioplasts contain much lower levels of linolenate than
chloroplasts (Tremolieres and Lepage, 1971; Nichols,
1965; Tremolieres and Mazliak, 1970). Dark-grown
pea seedlings are rich in linoleic acid. After
illumination of these seedlings, a very significant increase in
linolenic acid is observed in the young leaf sections,
whereas only small variations are seen in fatty acid
composition of other sections (Tremolieres and
Lepage, 1971). Studies have also shown that
photoautotrophic cells in culture produce much more
linolenic acid than heterotrophic cells (Husemann
et al., 1980).

It was found that greening cucumber cotyledons
exhibited a dramatic increase in the ability to
desaturate exogenously added [14C]linoleic acid
(Murphy and Stumpf, 1979). The inhibition of the
light-dependent increase in desaturating activity by
cycloheximide suggests that this process is
dependent on protein synthesis on the 80S ribosomes (i.e.
cytoplasmic), which parallels similar findings in
other light-induced systems. However, oleate and
linoleate desaturation in leaves of maize seedlings
was largely independent of the previous light
treatment of the seedlings (Hawke and Stumpf, 1980);
there was no evidence of light-induced desaturase
activities. These results are in sharp contrast to those
observed with developing cucumber cotyledons. In
vivo desaturase activity was present in tissues of
widely different levels of differentiation and
chlorophyll content obtained from light-grown maize
seedlings.

Water stress on plants likewise results in a change
of linolenic acid synthesis (Pham-Thi et al., 1982).
Drought appears to reduce the ability of plants to
synthesize 18:3. Studies have shown a decrease in
this polyunsaturated fatty acid in cotton leaves
submitted to water stress by withholding irrigation.
Experiments on incorporation with [14C]acetate as
the precursor clearly indicate that water deficits
provoke a severe inhibition of unsaturated fatty acid
biosynthesis. The inhibition of oleate and linoleate
desaturation by drought certainly contributes to the
decrease in the content of leaf polyunsaturated fatty
acids observed in water-stressed cotton leaves
(Ouedraogo et al., 1984; Pham-Thi et al., 1985, 1987).

The effect of salt stress such as sodium chloride 
on lipids is expressed mainly by a decrease of the 
linolenic acid content (Zarrouk and Cherif, 1984). 
It seems that a variety of environmental factors can 
affect polyunsaturated fatty acid synthesis, either 
directly or indirectly. However, the molecular mecha- 
nisms of such changes are obscure, and it is not yet 
known what role these changes in lipids may play 
in the adaptation of plants to such environmental 
stresses. 

Conclusion 

This review on the current understanding of 
linolenate production indicates that the process of 
the biosynthesis of linolenic acid is complex and 
many fundamental questions remain to be answered. 
In particular, further studies are needed to establish 
the substrates for 18:2 desaturation, existence of 
multiple distinct desaturases, subcellular sites of the 
desaturation, enzymatic and molecular genetic regu- 
lation, and the coordination of different pathways 
for linolenic acid formation. 

Regarding the manipulation of linolenate
content in oilseed crops, conventional oilseed breeding
has been primarily concerned with the genetic
expression of fatty acid composition. But the more
that is known of fatty acid and lipid synthesis, the
more effectively the breeder can assess the
opportunities for oil quality improvement and design the
appropriate breeding strategies. Moreover, the rapid
development of biotechnology opens additional
avenues for the genetic engineering of fatty acid
composition of oilseed crops (Knauf, 1987). The
scope of possibilities of manipulating storage fatty
acids and lipids using genetic engineering is
indicated by the diversity of different fatty acid
compositions which exist in various types of oilseeds. Plant
breeders have already shown that the composition
of polyunsaturated fatty acids of two cultivars of an
oilseed crop can differ considerably without obvious
effects on agronomic suitability. The modification
of seed storage fatty acid composition appears to
be feasible, but applied research and product
development will depend on gaining a better
understanding of the biochemical and genetic processes of
lipid biosynthesis. Requisite research for the effective
application of biotechnological approaches to the
manipulation of lipid composition will be the
identification and characterization of the key enzymes
controlling lipid composition.

In soybeans, one goal of lipid alteration is to
decrease the linolenic acid content in order to
increase the stability of soybean oil for cooking
purposes. The complexity of linolenic acid
biosynthesis within plant tissues and within different parts
of a cell presents difficult challenges to current
biotechniques. In addition, a prerequisite in a genetic
engineering project is the ability to monitor the gene
of interest and its product. Therefore, by using
various approaches, studies in several laboratories
are currently endeavoring to identify the gene
products involved in the control of 18:3 content
(Somerville et al., 1987; Wang et al., 1987 a and b).
Further research will be aimed at isolating those
genes. The fact that a number of studies have shown
that single gene changes can have a large impact on
18:3 content (Wilcox and Cavins, 1985; Green, 1986;
Somerville et al., 1987) is encouraging to the
prospects of genetic engineering of plants for altered lipid
composition. However, progress along these lines
will be complicated unless effective protocols are
developed for the purification of the key enzymes
(such as linoleate desaturases) controlling linolenate
levels in plant tissues.

[34] Stearoyl-Acyl Carrier Protein Desaturase
from Safflower Seeds

By Tom McKeon and Paul K. Stumpf 

Stearoyl-ACP (acyl carrier protein) desaturase is the enzyme respon- 
sible for the synthesis of oleic acid in plants. Nagai and Bloch, who first 
characterized the activity, found that the enzyme requires stearoyl-ACP, 
reduced ferredoxin, and molecular oxygen. 

Stearoyl-ACP + ferredoxin(II) + O2 + 2H+ → oleoyl-ACP + ferredoxin(III) + 2H2O

The stearoyl-ACP desaturase is easily extracted into buffer without
the use of detergents, has no requirement for added lipid, and has a
lipid-insoluble substrate, all in marked contrast to the stearoyl-CoA
desaturase of animal systems. However, because the plant and animal
desaturases both require oxygen and an electron transfer system to carry
out the same chemical reaction, it is thought that the mechanism of the
reaction may be the same for both types of enzyme.

Nagai and Bloch found the stearoyl-ACP desaturase in photosynthetic
tissue (Euglena gracilis and spinach chloroplasts). Subsequently,
Jaworski and Stumpf characterized the activity in immature safflower
(Carthamus tinctorius) seed, a nonphotosynthetic tissue. The activity is
also present in avocado mesocarp, immature soybean cotyledons, immature
jojoba nuts, and immature coconut. However, this report describes
only the stearoyl-ACP desaturase from safflower.

Assay Method

Principle. The assay for stearoyl-ACP desaturase is based on the
measurement of 14C-labeled oleic acid produced by desaturation of
14C-labeled stearoyl-ACP. Separation and quantitation of the 14C-labeled
fatty acids are carried out by thin-layer chromatography and
scintillation counting or by gas-liquid chromatography and radioactive
counting in a proportional counter.

Reagents

PIPES, 0.10 M, pH 6.0
NADPH, 25 mM, freshly prepared in 0.1 M Tricine, pH 8.2
Bovine serum albumin (BSA), lipid free, 10 mg/ml in water
Dithiothreitol (DTT), 0.10 M, freshly prepared
Ferredoxin, 2 mg/ml (Sigma, spinach, type III)
NADPH:ferredoxin oxidoreductase (Sigma), 2.5 units/ml
Catalase (Sigma, bovine liver, 800,000 units/ml)
[14C]Stearoyl-ACP, 10 µM in 0.1 M PIPES, pH 5.8; its synthesis is
described after the assay procedure
NaOH, 8 M
H2SO4, 4 M
Stearic acid and oleic acid, 1 mg/ml each in acetone
Petroleum ether
Diazomethane (20 mg/ml in diethyl ether)
AgNO3-silica gel G thin-layer plates, 0.25 mm thick (Redi-Coat AG,
Supelco)
2,7-Dichlorofluorescein, 0.1% in methanol

Procedure. The following reagents are added for each assay: water,
150 µl; DTT, 5 µl; BSA, 10 µl; NADPH, 15 µl; ferredoxin, 25 µl;
NADPH:ferredoxin oxidoreductase, 3 µl; and catalase, 1 µl. This mixture
is kept at room temperature for 10 min and is then added to a
13 × 100 mm screw-cap test tube containing 250 µl of PIPES buffer. The
stearoyl-ACP desaturase preparation is added in a volume of 10 µl, and
the reaction is started by adding 30 µl of stearoyl-ACP and incubating
at 23° with shaking for 10 min. The reaction is stopped by adding 125 µl
of 8 M NaOH and 0.1 ml of the fatty acid solution. The tubes are capped
and incubated for 1 hr at 80°. The mixture is acidified with 160 µl of
4 M H2SO4 and vigorously extracted three times with 2-ml portions of
petroleum ether. The extract is evaporated under nitrogen, methylated
with 0.5 ml of diazomethane solution for 30 min on ice, and then
evaporated to dryness. The methyl esters of stearate and oleate are then
separated and quantitated by either of two methods: thin-layer
chromatography on AgNO3-silica gel plates as described by Holloway or
gas chromatography (10% DEGS-PS on Supelcoport 80/10; 6 ft. x i in.
column at 180°) followed by counting of the radioactivity in a gas
proportional counter. Radio-gas chromatography avoids the slight
complication of correcting for the [14C]palmitate contaminant present in
most stearoyl-ACP preparations, but, for accuracy and sensitivity,
thin-layer chromatography is the method of choice. One unit of activity
is defined as 1 µmol of oleate produced per milligram of protein per
minute.
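The volumes in the procedure can be checked with a little arithmetic. The sketch below is only an illustration: it sums the listed additions and estimates the final concentration of a few components from the stock concentrations in the reagent list. The dictionary and function names are hypothetical, and the calculation assumes simple additive volumes and ignores anything contributed by the enzyme preparation itself.

```python
# Volumes in microliters, copied from the assay procedure above.
PREMIX_UL = {
    "water": 150,
    "DTT": 5,
    "BSA": 10,
    "NADPH": 15,
    "ferredoxin": 25,
    "NADPH:ferredoxin oxidoreductase": 3,
    "catalase": 1,
}
LATER_ADDITIONS_UL = {
    "PIPES buffer": 250,
    "desaturase preparation": 10,
    "stearoyl-ACP": 30,
}

# Stock concentrations from the reagent list, as (value, unit).
STOCKS = {
    "DTT": (100.0, "mM"),
    "BSA": (10.0, "mg/ml"),
    "NADPH": (25.0, "mM"),
    "ferredoxin": (2.0, "mg/ml"),
    "stearoyl-ACP": (10.0, "µM"),
}


def total_volume_ul() -> int:
    """Total assay volume, assuming the volumes are simply additive."""
    return sum(PREMIX_UL.values()) + sum(LATER_ADDITIONS_UL.values())


def final_concentrations() -> dict:
    """Dilute each stock into the total assay volume (C1 * V1 / V_total)."""
    volumes = {**PREMIX_UL, **LATER_ADDITIONS_UL}
    total = total_volume_ul()
    return {
        name: (stock * volumes[name] / total, unit)
        for name, (stock, unit) in STOCKS.items()
    }


if __name__ == "__main__":
    print(f"total volume: {total_volume_ul()} µl")
    for name, (value, unit) in final_concentrations().items():
        print(f"{name}: {value:.3g} {unit}")
```

Under these assumptions the assay volume comes to roughly 0.5 ml, and the labeled substrate is present at well under 1 µM.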

An alternative method for reducing ferredoxin uses a chloroplast grana
suspension, ascorbic acid, 2,6-dichlorophenolindophenol, and light. This
system has been described in detail by Jaworski and Stumpf.



Stearoyl-ACP Synthesis 

Procedure. Stearoyl-ACP is made with a safflower fatty acid synthase
system, [14C]malonyl-CoA, and Escherichia coli ACP. The method
described herein differs only slightly from that described by Jaworski
and Stumpf.⁹

Immature safflower seeds are suspended in an equal volume of 0.10 M
potassium phosphate, 5 mM sodium ascorbate, pH 6.8, and are
homogenized with a Polytron instrument for three half-minute periods at
half speed, centrifuged at 12,000 g for 20 min, and filtered through four
layers of cheesecloth and one layer of Miracloth. The safflower
supernatant is used as a source of fatty acid synthase with no further
purification. It is stable when frozen for 6 weeks.

The incubation medium contains the following components in a total
volume of 5 ml: water, 2.2 ml; 25 mM NADH, 100 µl; 25 mM NADPH, 100 µl;
1.0 M Tricine (K+), pH 7.9, 250 µl; 0.10 M DTT, 25 µl; 200 mM MgCl2,
25 µl; 10 mM malonyl-CoA, 1.0 ml; [1,3-14C]malonyl-CoA (50-60
mCi/mmol), 10 µCi in 500 µl; and ACP (4 mg/ml), 175 µl. The ACP used is
purified from E. coli by the method of Majerus et al.¹⁰ to 90% purity,
and is reduced with 1 mM DTT for 15 min just prior to use. The reaction
mixture is carefully bubbled with nitrogen for 5 min; 625 µl of safflower
supernatant are added, then the mixture is again bubbled with nitrogen
for a minute, stoppered, and placed in a 23° water bath. The reaction is
stopped after 45 min by the addition of 0.55 ml of 50% trichloroacetic
acid (TCA) in the hood and bubbled with nitrogen to displace 14CO2; it is
held on ice for 30 min and centrifuged at 5000 g for 5 min. The pellet is
redissolved in 2.5 ml of 0.10 M PIPES, pH 5.8, titrating with 1 M KOH if
necessary; debris is removed by centrifugation, and solid ammonium
sulfate is added to 70% saturation (0°). The precipitate is centrifuged
at 12,000 g for 10 min, and the supernatant is acidified with 50% TCA to
10%. The TCA precipitate is dissolved in 1 ml of PIPES buffer as before;
insoluble material is removed by centrifugation, and the concentration of
the stearoyl-ACP is adjusted to 10 µM. This preparation provides 25-40%
of the theoretical yield of 14C in acyl-ACP. The product, as analyzed by
radio-gas chromatography and AgNO3-silica gel TLC, contains 80-90%
stearoyl-ACP, 10-20% palmitoyl-ACP, and less than 0.5% oleoyl-ACP.
Frozen solutions of acyl-ACP are stable for over 2 months.

⁹ J. G. Jaworski and P. K. Stumpf, Arch. Biochem. Biophys. 162, 166 (1974).
¹⁰ P. W. Majerus, A. W. Alberts, and P. R. Vagelos, this series, Vol. 14 [6].

An alternative method for making acyl-ACP of specific chain length is
the acyl-ACP synthetase reaction described by Spencer et al.¹¹ Since a
specific fatty acid may be ligated to the ACP with this system, it has
been used to make the substrates employed in specificity studies.
However, this system does not efficiently ligate stearic acid to ACP
(2-4%) in our hands; therefore, the fatty acid synthase reaction is
routinely used to produce substrate for desaturase assays. Another
method for making acyl-ACP is described in this volume [21].



Purification 
Materials 

Immature safflower seed, Gila variety, harvested at approximately
14-18 days after flowering, as indicated by a charcoal gray seed
coat

Acetone, reagent grade, -20° 
Diethyl ether, anhydrous 

DEAE-cellulose, equilibrated with 0.02 M potassium phosphate, pH 

ACP-Sepharose 4B, 2 mg of ACP per milliliter of wet gel; the column
material was made with purified E. coli ACP, and cyanogen
bromide activated Sepharose 4B by the method of March et al.¹²
The reaction was carried out at pH 6.5 in 0.1 M NaHCO3 for 1 day
at 4°, and 70% of the ACP was covalently bound to the Sepharose

Potassium phosphate buffers, 0.02 M, 0.10 M, and 0.30 M, all pH 6.8,
sterilized and degassed



Procedure 

Acetone Powder. Immature safflower seeds (stored at -20°) are ground
with an equal volume of acetone at high speed in a blender. The
suspension is suction-filtered, and the retained material is repeatedly
extracted with acetone as above until the filtrate is clear and
colorless. After the third extraction, the acetone suspension is passed
through a coarse sieve to remove fragments of seed coat. Generally, five
extractions are required to remove the lipids and phenolics. After the
final filtration, the retained material is rinsed several times with a
small volume of ether at -20° to remove acetone, and then is kept under
suction or in a vacuum desiccator to remove the last trace of ether.
Stearoyl-ACP desaturase is stable in frozen seeds for at least 2 years
and in the acetone powder for at least 3 months.

¹¹ A. K. Spencer, A. D. Greenspan, and J. E. Cronan, Jr., FEBS Lett. 101, 253 (1979).
¹² S. C. March, I. Parikh, and P. Cuatrecasas, Anal. Biochem. 60, 149 (1974).

The following steps are carried out at 0-4°. 

Acetone Powder Extract. Acetone powder from a given weight of seed
is triturated with twice that weight of 0.02 M phosphate buffer and gently
agitated for 1 hr. The suspension is then centrifuged at 12,000 g for 20 min
and filtered through Miracloth; the supernatant, which contains the
desaturase, is immediately applied to DEAE-cellulose or frozen. The
activity in this preparation is stable for 3-4 weeks at -20° or for
1 week at 4°.

DEAE-Cellulose Pass-through. Acetone powder extract is passed 
through a column of DEAE-cellulose (1 ml bed volume/3 ml extract), and 
the column is washed with one bed volume of 0.02 M phosphate buffer. 
The pass-through and effluent from the wash are collected. While this step 
does afford some purification (see the table), its principal purpose is to 
eliminate an acyl-ACP thioesterase present in the extract. Approximately
80% of the thioesterase is thus eliminated.

ACP-Sepharose 4B column. The capacity of the ACP-Sepharose is 5 ml 
of DEAE-cellulose pass-through per milliliter of column material. De- 
creasing this ratio does not improve the percentage yield, and increasing 
the ratio decreases the percentage yield. 

A column with a 20-ml bed volume is loaded at a flow rate of 0.5 ml per
minute, washed with two bed volumes of 0.02 M buffer and three bed
volumes of 0.10 M buffer. The stearoyl-ACP desaturase is eluted with
0.30 M buffer and collected in fractions of 1.5 ml. The early fractions
contain most of the contaminating acyl-ACP thioesterase; the most
purified fractions of desaturase contain acyl-ACP thioesterase as 5-10%
of the bulk protein. The desaturase activity from this preparation is
stable for 1 week at 4°.

Purification of Stearoyl-Acyl Carrier Protein (ACP) Desaturase

                              Total       Total      Specific
                              protein(a)  activity   activity   Yield   Purification
Step                          (mg)        (mU)       (mU/mg)    (%)     factor

Acetone powder extract        380         205        0.55
DEAE-cellulose pass-through   170         162        0.95       79      1.7
ACP-Sepharose 4B              0.34        38         110        19      200

(a) Protein was determined by the method of O. H. Lowry, N. J. Rosebrough,
A. L. Farr, and R. J. Randall, J. Biol. Chem. 193, 265 (1951), using
bovine serum albumin as a standard.
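The derived columns of the purification table follow from the measured protein and activity. As a minimal sketch (the class and function names are hypothetical, and the input numbers are the ones printed in the table), specific activity is activity divided by protein, yield is activity relative to the first step, and the purification factor is specific activity relative to the first step:

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    protein_mg: float
    activity_mU: float


# Measured values as printed in the purification table.
STEPS = [
    Step("Acetone powder extract", 380, 205),
    Step("DEAE-cellulose pass-through", 170, 162),
    Step("ACP-Sepharose 4B", 0.34, 38),
]


def summarize(steps):
    """Return specific activity, yield and purification factor per step."""
    first = steps[0]
    base_specific = first.activity_mU / first.protein_mg
    rows = []
    for step in steps:
        specific = step.activity_mU / step.protein_mg
        rows.append({
            "step": step.name,
            "specific_activity": specific,                        # mU/mg
            "yield_pct": 100 * step.activity_mU / first.activity_mU,
            "purification": specific / base_specific,
        })
    return rows


if __name__ == "__main__":
    for row in summarize(STEPS):
        print(f"{row['step']:<30} {row['specific_activity']:8.2f} "
              f"{row['yield_pct']:6.1f} {row['purification']:8.1f}")
```

Values recomputed this way differ slightly from the printed ones, presumably because the table was calculated from rounded specific activities.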

Purity. The most purified preparations of stearoyl-ACP desaturase dis- 
play one prominent band and several minor bands on SDS-gel 
electrophoresis. By comparing samples containing various amounts of de- 
saturase and thioesterase, it appears that the prominent band corresponds 
to the stearoyl-ACP desaturase. 

Properties 

Specificity. At substrate concentrations of 0.3 µM, the stearoyl-ACP
desaturase is 40 times more active on stearoyl-ACP than on stearoyl-CoA
and 80 times more active than on palmitoyl-ACP. This high specificity for
stearoyl-ACP contrasts with the promiscuous activity of the analogous
stearoyl-CoA desaturase from animal systems, which is quite active on
acyl-CoA containing 13-19 carbon atoms in the acyl chain.

pH Activity Profile. The desaturase is half-maximally active at pH 5.5
and pH 8.5, with the maximum activity at pH 5.5 in acetate buffer.
However, activity at a given pH is dependent on the type of buffer, even
at constant ionic strength.

Stability. The stearoyl-ACP desaturase appears to be fairly unstable. It
is sensitive to pH, losing 50% or more activity irreversibly when
incubated at a pH outside the range pH 6.0 to pH 7.5. It is inactivated
on heating at 50° for 1 min. It is unstable to dialysis, irreversibly
losing 50% to 100% activity. Finally, further attempts to purify or
concentrate the eluent from the ACP-Sepharose column result in nearly
total loss of activity.

Miscellaneous Properties. The concentration of oxygen required for
maximum activity is 320 µM, which is slightly higher than the oxygen
concentration in air-saturated incubation medium, namely 280 µM at 23°;
half-maximum activity occurs at a concentration of 60 µM.

Catalase is not required for the desaturase reaction to occur; however,
it does stimulate the reaction fivefold. Presumably, catalase protects the
desaturase system by scavenging H2O2. Both the desaturase and the
ferredoxin, NADPH:ferredoxin oxidoreductase system are partially
inactivated by 0.1 mM H2O2, and catalase partially reverses this
inactivation. However, two other enzymatic H2O2 scavengers do not.
Neither horseradish peroxidase nor glutathione peroxidase can replace
catalase, and horseradish peroxidase inhibits the desaturation reaction.

¹³ H. G. Enoch, A. Catala, and P. Strittmatter, J. Biol. Chem. 251, 5095 (1976).



[35] Acyl Chain Elongation in Developing Oilseeds 
By Michael R. Pollard 

The lipids of most plant tissues contain a narrow spectrum of fatty
acids: palmitate, oleate, linoleate, and α-linolenate. The neutral lipids
(triacylglycerols)¹ of oilseeds, however, contain a diverse range of fatty
acids.² One structural variation found is that of acids with chain lengths
greater than the usual 16 or 18 carbon atoms. This chapter describes
approaches to studying the biosynthesis of acids of chain length C20 or
greater in developing oilseeds. Some of the considerations noted for the
investigation of acyl chain elongation are valid for the investigation of
other types of acyl metabolism found in developing oilseeds. Ideally both
in vivo and in vitro experiments are required to demonstrate chain
elongation. A radio-gas chromatography machine is useful for detection of
14C-labeled fatty ester.

Supply of Maturing Seed Tissue 

The choice of a suitable plant will greatly facilitate the investigation. 
An ideal plant will exhibit the following features. 

1. It should be able to produce a steady supply of developing seeds. 
That is, a plant is preferred that can be grown and induced to flower 
all year round, probably in the controlled environment of a growth 
chamber or greenhouse. A short growth period and early flowering 
will give maximum experimental flexibility.

2. Larger seeds will help reduce the considerable labor of hand polli- 
nation, picking, and removal of the seed coat or pod. 

3. For studies on chain elongation, a seed is required that has a high
percentage of its fatty acids with a chain length of C20 or greater.
More than 10% of the dry weight of the mature seed should be lipid
in order to measure accurately lipogenic activities in vivo and in
vitro.

¹ The single exception found in higher plants is the wax esters of jojoba (Simmondsia
chinensis) seeds [T. W. Miwa, J. Am. Oil Chem. Soc. 48, 259 (1971)].

² C. Hitchcock and B. W. Nichols, "Plant Lipid Biochemistry," Chapter 1. Academic
Press, New York, 1971.

Plants that have been used to study the biochemistry of chain
elongation in maturing oilseeds are high erucate strains of Brassica
napus (rape),³ Brassica campestris (turnip rape),⁴ and Brassica juncea
(mustard rape),⁵ as well as Limnanthes alba (meadowfoam),⁶ Tropaeolum
majus (nasturtium),⁷ and Crambe abyssinica.⁸ They are all annuals.
Sometimes, the choice of an oilseed that can be harvested only at a
particular time is unavoidable, as in the study of wax ester
biosynthesis, which is unique to Simmondsia chinensis (jojoba). In this
case the project becomes distinctly seasonal.

An important preliminary step is to monitor the development of the
seeds (Fig. 1). This will ensure that maturing seeds are harvested at the
time of maximum lipid biosynthesis. Lipid content (expressed as mass of
total or neutral lipid per seed or per gram of fresh or dry seed weight)
should be measured as a function of days after flowering (field grown
plants) or days after pollination (greenhouse plants). Appelquist has
reviewed the topic of lipid accumulation during seed maturation.¹¹ Several
extraction procedures for lipids are suitable. Soxhlet extraction of the
dried, ground seeds with petroleum ether will yield neutral lipids.
Alternatively, total lipids can be extracted from fresh tissue by
homogenizing in petroleum ether-isopropanol, 3:2 (v/v)¹³ or in
chloroform-methanol, 2:1 (v/v),¹⁴ followed by the appropriate aqueous
salt wash. Extraction of the seed residues should be exhaustive.
Developing seeds are ideal for biochemical studies when about 10-50% of
the eventual neutral lipid mass has been deposited. Over this period the
in vivo incorporation of [1-14C]acetate into lipids is generally at a
maximum (Fig. 1). Seeds picked later have much endogenous lipid. This can
cause severe mass overloading during radio-chromatographic analysis.

³ R. K. Downey and B. M. Craig, J. Am. Oil Chem. Soc. 41, 475 (1964).
⁴ L. A. Appelquist, J. Am. Oil Chem. Soc. 50(2) (1973).
⁵ A. Benzioni and M. R. Pollard, unpublished observations, 1979.
⁶ M. R. Pollard and P. K. Stumpf, Plant Physiol. 66, 649 (1980).
⁷ M. R. Pollard and P. K. Stumpf, Plant Physiol. 66, 641 (1980).
⁸ R. S. Appleby, M. I. Gurr, and B. W. Nichols, Eur. J. Biochem. 48, 209 (1974).
⁹ J. B. Ohlrogge, M. R. Pollard, and P. K. Stumpf, Lipids 13, 203 (1978).
¹⁰ M. R. Pollard, T. McKeon, L. M. Gupta, and P. K. Stumpf, Lipids 14, 651 (1979).
¹¹ L. A. Appelquist, in "Recent Advances in the Chemistry and Biochemistry of Plant
Lipids" (T. Galliard and E. I. Mercer, eds.), pp. 247-286. Academic Press, New York,
1975.
¹² A. Vogel, "A Textbook of Practical Organic Chemistry," 4th ed., p. 137. Longmans
Green, New York, 1978.
¹³ A. Hara and N. S. Radin, Anal. Biochem. 90, 420 (1978).
¹⁴ J. Folch, M. Lees, and G. H. Sloane-Stanley, J. Biol. Chem. 226, 497 (1957).



ARTICLE 



Cloning of Δ12- and Δ6-Desaturases from Mortierella alpina
and Recombinant Production of γ-Linolenic Acid
in Saccharomyces cerevisiae

Yung-Sheng Huang, Sunita Chaudhary, Jennifer M. Thurmond,
Emil G. Bobik, Jr., Ling Yuan,¹ George M. Chan, Stephen J. Kirchner,
Pradip Mukerji,* and Deborah S. Knutzon

Ross Products Division, Abbott Laboratories, Columbus, Ohio 43215, and Calgene LLC, Davis, California 95616



ABSTRACT: Two cDNA clones with homology to known desaturase genes were
isolated from the fungus Mortierella alpina. The open reading frame in
one clone encoded 399 amino acids and exhibited Δ12-desaturase activity
when expressed in Saccharomyces cerevisiae in the presence of endogenous
fatty acid substrate oleic acid. The insert in another clone contained
an open reading frame encoding 457 amino acids and exhibited
Δ6-desaturase activity in S. cerevisiae in the presence of exogenous
fatty acid substrate linoleic acid. Expression of the Δ12-desaturase
gene under appropriate media and temperature conditions led to the
production of linoleic acid at levels up to 25% of the total fatty acids
in yeast. When linoleic acid was provided as an exogenous substrate to
the yeast cultures expressing the Δ6-desaturase activity, the level of
γ-linolenic acid reached 10% of the total yeast fatty acids.
Co-expression of both the Δ6- and Δ12-desaturase cDNA resulted in the
endogenous production of γ-linolenic acid. The yields of γ-linolenic
acid reached as high as 8% of total fatty acids in yeast.

Paper no. L8157 in Lipids 34, 649-659 (July 1999).



The primary products of fatty acid biosynthesis in most organisms are
16- and 18-carbon compounds. The relative ratio of chain lengths and
degree of unsaturation of these fatty acids vary widely among species.
Mammals, for example, produce primarily saturated and monounsaturated
fatty acids, while most higher plants produce fatty acids with one, two,
or three double bonds. Indeed, polyunsaturated fatty acids, such as
linoleic acid (Δ9,12-18:2) and α-linolenic acid (Δ9,12,15-18:3), are
regarded as essential fatty acids in the diet because mammals lack the
ability to synthesize them. However, when ingested, mammals have the
ability to metabolize linoleic and α-linolenic acids to form the n-6 and
n-3 families of long-chain polyunsaturated fatty acids (LC-PUFA),
respectively. These LC-PUFA are important cellular components conferring
fluidity to membranes and functioning as precursors of biologically
active eicosanoids such as prostaglandins, prostacyclins, and
leukotrienes which regulate normal physiological functions (1).

¹Present address: Maxygen, Santa Clara, CA 95051.
*To whom correspondence should be addressed.
E-mail: pradip.mukerji@rossnutrition.com

Abbreviations: GC, gas chromatography; GLA, γ-linolenic acid; LC-PUFA,
long-chain polyunsaturated fatty acid; MS, mass spectrometry; PCR,
polymerase chain reaction; TPI, triose phosphate isomerase.

In mammals, the formation of LC-PUFA is rate-limited by the step of
Δ6-desaturation, which converts linoleic acid to γ-linolenic acid (GLA,
Δ6,9,12-18:3) and α-linolenic acid to stearidonic acid
(Δ6,9,12,15-18:4). Many physiological and pathological conditions have
been shown to depress this metabolic step, and consequently, the
production of LC-PUFA (2). However, bypassing the Δ6-desaturation via
dietary supplementation with GLA can effectively alleviate many
pathological diseases associated with low levels of PUFA (1). This
beneficial effect prompted GLA-rich oil to become a much-demanded
commodity. GLA is currently used in the treatment of eczema and
mastalgia (1). At the present time, the predominant sources of GLA are
oils from plants such as borage, evening primrose and black currant,
and from microorganisms, such as Mortierella spp., Mucor spp. and
cyanobacteria (3). However, these GLA sources are not ideal for dietary
supplementation due to high fluctuations in availability,
production/purification costs, unpleasant tastes and odors, and safety
concerns. Thus, interest in developing more reliable and economical
alternative sources of GLA and other LC-PUFA is growing.

The primary product of fatty acid biosynthesis in most plants and yeast
is the monounsaturated, 18-carbon oleic acid. Two desaturation steps, at
the Δ12 and Δ6 positions, necessary for the production of GLA from oleic
acid, are shown below.

Δ9-18:1  --(Δ12-desaturase)-->  Δ9,12-18:2  --(Δ6-desaturase)-->  Δ6,9,12-18:3
Oleic                           Linoleic                          γ-Linolenic

The cDNA clones encoding Δ12-desaturases were isolated from several
species of cyanobacteria (4,5) and plants including Arabidopsis (6),
soybean (7), and parsley (8). Δ6-Desaturase-encoding cDNA were isolated
from cyanobacteria (9), borage (10), and nematode (11). These enzymes,
as well as numerous examples of Δ15/n-3 desaturases (12,13), are all
believed to be integral membrane proteins utilizing an acyl-lipid
substrate, and with the exception of the cyanobacterial enzymes,
requiring cytochrome b5 for the electron transport. The deduced amino
acid sequences of these desaturases show a good deal of similarity, most
notably in the region of three histidine-rich motifs that are believed
to be involved in iron binding (14).

In this study, we utilized the filamentous fungus, Mortierella alpina,
as the source for desaturase genes. This approach was based on the fact
that this fungus is rich in linoleic acid and its LC-PUFA n-6
metabolites, GLA, and arachidonic acid (Δ5,8,11,14-20:4). Using a
strategy based on degenerate oligonucleotide primers designed to amplify
sequences present at the second and third His boxes of known acyl lipid
desaturases (14), we previously isolated a cDNA clone encoding the
M. alpina Δ5-desaturase (15). A similar strategy utilizing different
degenerate primers was also successful in amplifying the same
Δ5-desaturase (16). Such polymerase chain reaction (PCR) approaches are
limited, however, by the degree of homology of the target cDNA to the
particular primers and conditions utilized. In order to achieve a more
thorough examination of the fatty acid desaturases present in the
fungus, an alternate approach of sequencing random cDNA clones was also
employed. Since it was known that the previously characterized
membrane-bound Δ12- and Δ15-desaturases, as well as the available
cyanobacterial Δ6-desaturase sequences, showed significant amino acid
sequence conservation, particularly in the histidine-rich regions, it
was postulated that potential Mortierella desaturase cDNA could be
recognized based on their deduced amino acid sequences. Indeed, this was
the strategy that led to the identification of a borage Δ6-desaturase
(11) and a castor oleate 12-hydroxylase (17). Because the first
histidine-rich motif (His-box) region can occur from 80 to 160 amino
acids (240-480 bp) from the N-terminus, and the third region can be
roughly 250-300 amino acids (750-900 bp) into the desaturase sequence
(14), 300-400 bp of DNA sequence information obtained from the 5'-end of
full-length clones might not contain the regions of highest homology
among desaturases. Since at the time this work was initiated, no
desaturase sequence was identified from M. alpina and it was not known
how much homology they might display to known sequences, we chose to
obtain information from the internal sequences of cDNA clones instead of
the 5'-end of full-length clones.
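The sequencing argument above rests on a simple conversion of three base pairs per amino acid. The sketch below only illustrates that arithmetic; the names are hypothetical, and it assumes the read starts at the first codon, whereas a real 5' read would also spend some bases on untranslated sequence and so reach even less of the coding region.

```python
BP_PER_CODON = 3

# Positions quoted in the text, in amino acids from the N-terminus.
HIS_BOX_1_AA = (80, 160)
HIS_BOX_3_AA = (250, 300)

# Typical length of a single sequencing read from the 5'-end, in bp.
READ_LENGTH_BP = (300, 400)


def aa_to_bp(aa_range):
    """Convert an amino-acid position range to base pairs of coding sequence."""
    start, end = aa_range
    return start * BP_PER_CODON, end * BP_PER_CODON


def read_reaches(read_length_bp, position_bp):
    """True if a read of this length extends at least to the given position."""
    return read_length_bp >= position_bp


if __name__ == "__main__":
    box1 = aa_to_bp(HIS_BOX_1_AA)
    box3 = aa_to_bp(HIS_BOX_3_AA)
    print("first His box at", box1, "bp; third His box at", box3, "bp")
    for read in READ_LENGTH_BP:
        print(read, "bp read reaches latest first box:",
              read_reaches(read, box1[1]),
              "| reaches earliest third box:", read_reaches(read, box3[0]))
```

Even the longer read does not reach a first His box sitting at the far end of its range, and neither read reaches the third box.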

Expression of the Mortierella desaturase candidates was carried out in
baker's yeast, Saccharomyces cerevisiae. This eukaryotic organism was
previously shown to be a suitable host containing the necessary
cofactors for functional expression of acyl-lipid desaturases.
Saccharomyces cerevisiae contains a Δ9-desaturase capable of producing
monounsaturated palmitoleic and oleic fatty acids, but does not carry
out further desaturations. Expression of an Arabidopsis FAD2 cDNA in
S. cerevisiae resulted in the production of linoleic and
Δ9,12-hexadecadienoic acids from the endogenous oleic acid and
palmitoleic acid substrates, respectively (18,19). By culturing
S. cerevisiae in the presence of exogenous fatty acid substrates,
functional expression of a nematode Δ6-desaturase (11) and a fungal
Δ5-desaturase (15,16) were demonstrated. In this study, we report the
isolation of Δ12- and Δ6-desaturases from M. alpina. Simultaneous
expression of these two genes in S. cerevisiae drives production of GLA
at levels of up to 8% of the total fatty acids without the requirement
for exogenous fatty acid substrates.

MATERIALS AND METHODS 

cDNA library construction. Synthesis of M. alpina cDNA was described
previously (15). Briefly, double-stranded cDNA were size-fractionated by
column chromatography. The two fractions containing the largest cDNA
were pooled and packaged to produce a "full-length" library (M7+8)
containing ca. 6 X 10^ clones with an average insert size of 1.77 kb. An
additional library (M11), to be used for random sequencing, was
constructed by packaging a fraction containing smaller cDNA, which would
most likely contain less than full-length clones as well as full-length
copies of shorter messages. The average insert size of this library was
1.1 kb; the titer was 240 pfu/µL. Library screening and plaque
purification were carried out with the M7+8 library using standard
protocols as described previously (20).

Random DNA sequencing. The cDNA-containing plasmids were excised from
the λ-ZipLox clones following manufacturer's recommendations (Life
Technologies, Gaithersburg, MD). Bacterial cells were plated on ECLB
plates containing 50 µg/mL penicillin. DNA sequence was obtained from
the 5'-end of the cDNA insert and compared to the National Center for
Biotechnology Information nonredundant database using a BLAST server.

Plasmid construction. For expression in yeast, the M. alpina cDNA
clones for Δ6- and Δ12-desaturase genes were first modified to create
EcoRI and XhoI restriction sites adjacent to the start and stop codons,
respectively. Each gene was amplified from the respective cDNA clone
using PCR with a pair of primers which have homology to the 5'-end and
3'-end of the gene (restriction sites underlined):

RO-192 rS^TAGGC TGAATTCA TGGCrGCTGCrCCCAGTGTGAGGACG-30 

and 

R0.193 (5'-AACTCK2CICffiAfiTrACTGCGCCTTACCCATCTTGGAG 

are forward and reverse primers with homology to the sequences around
the initiation and termination codons of Δ6-desaturase (Ma524),
respectively (shown in bold).

RO-194 (5^TAarrCX2AAri£ATGGCACCTCCCAACACT^ 

and 

RO-195 (V-AArr;G TCTmAGT TACTTCTTGAAAAAGACCACGTCrCC-30 

are forward and reverse primers homologous to the 5'- and
3'-ends of the Δ12-desaturase (Ma648), respectively.

The EcoRI/XhoI putative desaturase gene fragments were cloned into the
vector pYES2 (Invitrogen, San Diego, CA) for inducible expression under
the control of GAL1 promoter in yeast. This vector contains a selectable
marker gene which confers uracil prototrophy in the host. The plasmids
containing the putative Δ6-desaturase (Ma524) and Δ12-desaturase (Ma648)
genes were designated as pCGR5 and pCGR7, respectively. To construct
pCGR11 and pCGR12, the Δ6- and Δ12-desaturase coding regions were
isolated from pCGR5 and pCGR7, respectively, as EcoRI-XhoI fragments and
cloned into the pYX242 vector (Novagen, Madison, WI) digested with
EcoRI-XhoI. The pYX242 vector contains a marker gene for selection of
leucine prototrophy in the host and has the promoter of TPI (yeast
triose phosphate isomerase gene), which allows constitutive expression.
Co-expression of recombinant Δ6- and Δ12-desaturases can be achieved by
simultaneous introduction of pCGR5 with pCGR12 or pCGR7 with pCGR11 in
the appropriate host requiring both uracil and leucine for growth.

Yeast transformation and expression. Different combinations
of pCGR5, pCGR7, pCGR11, and pCGR12 were introduced
into a host strain of S. cerevisiae, SC334, which contains
a mutation (reg1-501) that alleviates catabolite repression
of the GAL1 promoter (21). Transformation was done using
the PEG/LiAc protocol as described previously (22).
Transformants were selected by plating on synthetic medium plates
with appropriate selection (21). Cells containing pCGR5 and
pCGR7 were selected on media lacking uracil, whereas the
pCGR11 and pCGR12 constructs were selected on media
lacking leucine.
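
The marker logic described above can be sketched in a few lines of
Python. This is an illustration only: the dictionary and the function
name dropout_medium are hypothetical, and only the plasmid-to-marker
assignments come from the text.

```python
# Prototrophy marker carried by each plasmid, as stated in the text
# (pYES2-based constructs restore uracil, pYX242-based ones leucine).
PLASMID_MARKER = {
    "pCGR5": "uracil",
    "pCGR7": "uracil",
    "pCGR11": "leucine",
    "pCGR12": "leucine",
}


def dropout_medium(plasmids):
    """Return the nutrients to omit so that every plasmid is selected for.

    Raises ValueError if two plasmids share a marker, because a single
    dropout cannot then select for both of them.
    """
    omitted = []
    for name in plasmids:
        marker = PLASMID_MARKER[name]
        if marker in omitted:
            raise ValueError(f"{name} shares the {marker} marker with another plasmid")
        omitted.append(marker)
    return sorted(omitted)


print(dropout_medium(["pCGR5"]))            # ['uracil']
print(dropout_medium(["pCGR7", "pCGR11"]))  # ['leucine', 'uracil']
```

Pairing pCGR5 with pCGR7 raises an error in this sketch, which mirrors
why the co-expression strains combine one uracil-selected and one
leucine-selected plasmid.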

Results from our preliminary studies showed that expression
of genes (Δ6- and Δ12-desaturases) was enhanced when
cultures were grown in synthetic medium at 15°C. In the present
study, colonies of transformants were first grown
overnight at 30°C in synthetic media. Overnight cultures (2-4
mL) were then used to inoculate 100 mL of minimal media
for studying the activities of recombinant desaturases.
Galactose was added at a final concentration of 2% to the medium
for induction of the GAL1 promoter in the strains containing
pCGR5 and pCGR7. When the enzyme substrate was provided
as the exogenous fatty acid, the fatty acid was supplemented
at a concentration of 25 µM. The culture was grown
for 48 h at 15°C, and subsequently harvested by centrifugation.
Cell pellets were washed once with sterile ddH2O to remove
the media. The host strain transformed with vector
alone was used as a negative control in all experiments.

Fatty acid analysis. The extraction of the yeast lipids followed
the procedures described previously (15). Briefly,
washed yeast pellets were extracted with 15 mL of methanol
and 30 mL of chloroform containing 100 µg of tridecanoin.
After extraction, the yeast lipids were first saponified, and the
liberated fatty acids were methylated. The distribution of fatty
acid methyl esters was then analyzed by gas chromatography
(GC) using a Hewlett-Packard 5890 II Plus gas chromatograph
(Hewlett-Packard, Palo Alto, CA) equipped with a
flame-ionization detector and a fused-silica capillary column
(Supelcomega; 50 m × 0.25 mm i.d.; Supelco, Bellefonte,
PA). In the present study, the quantity of the product formed
and the rate of conversion of substrate to product (conversion
rate = product/(substrate + product)) were calculated to reflect
the expression/activity of a given desaturase in this yeast
cell assay system.
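
The conversion-rate formula can be written out as a small worked
example. The sketch below is illustrative: the function name is ours,
and the inputs are the rounded wt% values listed in Table 1 for the
strain carrying pCGR11 and pCGR7, so results can differ slightly from
rates quoted in the text, which were presumably computed from
unrounded data.

```python
def conversion_rate(substrate, product):
    """Fraction of substrate converted: product / (substrate + product)."""
    total = substrate + product
    if total <= 0:
        raise ValueError("substrate + product must be positive")
    return product / total


# wt% of total fatty acids for the pCGR11/pCGR7 strain (Table 1)
oleic, linoleic, gla = 10.2, 10.1, 7.9

print(f"oleic -> linoleic: {conversion_rate(oleic, linoleic):.1%}")
print(f"linoleic -> GLA:   {conversion_rate(linoleic, gla):.1%}")
```

With these inputs the two rates come out near 50% and 44%, in line with
the values reported for this strain in the Results.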

The identification of a given novel fatty acid was verified
by gas chromatography-mass spectrometry (GC-MS) using
a Hewlett-Packard mass selective detector (model 5920)
operating at an ionization voltage of 70 eV with a scan range of
20-500 Da. The mass spectra of new peaks were compared
with those of authentic standards (Nu-Chek-Prep, Elysian,
MN) and those in the database NBS75K.L (National Bureau
of Standards).

RESULTS 

Isolation of a Δ6-desaturase-like cDNA clone from M. alpina.
DNA sequence was obtained from the 5'-end of cDNA in randomly
picked clones from the M. alpina M11 library. Sequence
of one such clone, Ma524, exhibited limited homology
to a known Synechocystis Δ6-desaturase (9) when compared
to the databanks. Overall, the level of homology was
low (BLAST score 114; P 4.7 × 10^[...]). The partial cDNA was
used as a probe to isolate a full-length clone, designated
pCGN5532, from the M7+8 library. The cDNA insert in
pCGN5532 (GenBank accession AF110510) was 1617 bp
and contained an open reading frame encoding 457 amino
acids flanked by 70 and 75 bp of 5'- and 3'-untranslated regions,
respectively. The deduced amino acid sequence is
aligned to that of borage Δ6-desaturase (10) in Figure 1. The
three "His-boxes," known to be conserved among membrane-bound
desaturases (6,14), were found to be present at amino-acid
positions 172-176, 209-213, and 395-399 in this sequence.
Similar to other membrane-bound Δ6- and Δ5-desaturases,
the final "HXXHH" histidine box motif was found to
be QXXHH (11,15,16). The predicted amino-acid sequence
from this clone is similar to the Δ6-desaturases from
Synechocystis spp. and Spirulina spp. (9), the borage Δ6-desaturase
(10), the nematode Caenorhabditis elegans (11), and a
cytochrome b5/desaturase fusion protein from sunflower (23).
As reported for other Δ5/Δ6 desaturases, the amino terminus
of the protein encoded by pCGN5532 was also homologous
to cytochrome b5 proteins.
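
A scan for His-box candidates of the kind described above can be
sketched with a regular expression. This is a hedged illustration, not
the authors' procedure: the pattern set (H-x3-H, H-x2-HH and the
Q-x2-HH variant) is an assumption chosen to cover five-residue boxes,
the function name is hypothetical, and the test sequence is made up.
Short motifs like these also occur by chance, so hits are only
candidates to be checked against an alignment.

```python
import re

# Five-residue histidine-box patterns: H-x3-H, H-x2-HH and the Q-x2-HH
# variant. A lookahead is used so overlapping candidates are all reported.
HIS_BOX = re.compile(r"(?=(H.{3}H|[HQ].{2}HH))")


def find_his_boxes(protein):
    """Return (start, end, motif) tuples using 1-based residue positions."""
    return [(m.start() + 1, m.start() + 5, m.group(1))
            for m in HIS_BOX.finditer(protein.upper())]


toy = "MAAHDAGHLLQIEHHGG"  # made-up sequence, not Ma524
for start, end, motif in find_his_boxes(toy):
    print(f"{start}-{end}: {motif}")
```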

Isolation of a Δ12-desaturase-like cDNA clone from M.
alpina. DNA sequence obtained from the 5'-end of another
random clone, Ma648, showed homology to the soybean n-6
desaturase (7). The homology of the partial M. alpina sequence
was again relatively weak (BLAST score 110; P 2.0 ×
10^[...]). Analysis of the open reading frames beginning at the
5'-end of Ma648 indicated that the first possible methionine
was in frame +1, which was the frame that showed desaturase
homology. Alignment of this open reading frame to the 5'-sequence
of other Δ12-desaturases indicated that the M. alpina
Ma648 clone was full-length. This cDNA was designated
pCGN5533, and no other corresponding clones were obtained
by library screening. The 1488 bp cDNA insert in
pCGN5533 (GenBank accession AF110509) contains 78 bp
of 5'- and 113 bp of 3'-noncoding sequences flanking an open
reading frame encoding 399 amino acids. Figure 2 shows the
alignment of the deduced amino acid sequence of pCGN5533
to the FAD2 (microsomal Δ12 desaturase) from Arabidopsis
(6). The three His-boxes are again present at positions
111-115, 147-151, and 338-342. Unlike Ma524, no homology
to cytochrome b5 sequence is present on the N-terminus
of this clone.

Functional expression of M. alpina desaturase clone
pCGN5533 (Ma648) in yeast. In order to assess the functional
specificity of the various M. alpina desaturase clones, the
coding regions were expressed in S. cerevisiae using the
inducible GAL1 promoter found in the commercial vector
pYES2. As described previously (15), recombinant yeast cells
were grown in the presence of various fatty acids in order to
provide substrates for desaturases involved in LC-PUFA
production. The deduced coding region of pCGN5533 (Ma648)
was inserted into the yeast expression vector pYES2 to create
pCGR7. Fatty acid profiles of lipid fractions from yeast
grown in the absence of exogenous fatty acid substrate show
that two novel fatty acids were produced in SC334(pCGR7)
(Fig. 3A). The first fatty acid showed a mass peak m/z = 266
(the expected molecular ion of 16:2), a retention time of 13.48
min, and a fragmentation pattern identical to those of
Δ9,12-16:2 (Fig. 3B). The second novel fatty acid exhibited a
retention time (17.28 min) in GC (Fig. 3A), mass peak (m/z =
294) and fragmentation pattern in GC-MS (data not shown)
identical to those of the authentic linoleic acid (Δ9,12-18:2).
These findings indicate that the endogenous oleic acid
(Δ9-18:1) was converted to linoleic acid (Δ9,12-18:2) by a
Δ12-desaturase activity expressed from the plasmid pCGR7.
The rate of conversion was found to be 71.4% (Table 1).
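
The molecular ions quoted above (m/z 266 and 294) follow from simple
arithmetic once it is recalled that the fatty acids were analyzed as
methyl esters. The sketch below, with a hypothetical function name,
computes the nominal mass of an unbranched fatty acid methyl ester
from its carbon number and number of double bonds using integer atomic
masses (C = 12, H = 1, O = 16).

```python
def fame_nominal_mass(carbons, double_bonds):
    """Nominal mass of the methyl ester of an unbranched fatty acid.

    The free acid is C(n) H(2n - 2d) O2; methylation adds one CH2 unit.
    """
    n_carbon = carbons + 1
    n_hydrogen = 2 * carbons - 2 * double_bonds + 2
    n_oxygen = 2
    return 12 * n_carbon + 1 * n_hydrogen + 16 * n_oxygen


for label, carbons, double_bonds in [("16:2", 16, 2), ("18:2", 18, 2), ("18:3", 18, 3)]:
    print(label, fame_nominal_mass(carbons, double_bonds))
```

Because positional isomers share the same molecular ion, this check
only narrows a peak to a chain length and degree of unsaturation;
retention time and fragmentation pattern are still needed to assign
the double-bond positions.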

Functional expression of M. alpina desaturase clone
pCGN5532 (Ma524) in yeast. The recombinant yeast
SC334(pCGR5), containing the Ma524 cDNA, was grown in
the presence of exogenous linoleic acid (Δ9,12-18:2), which
is the substrate for Δ6-desaturation. Analyses of the fatty acid
profile in the yeast lipid fraction indicate that the exogenous
linoleic acid was incorporated into lipids of both nontransformed
and transformed yeast (Fig. 4). However, GC analysis
revealed the presence of a novel fatty acid in the
SC334(pCGR5) yeast that was not present in yeast transformed
with vector alone. This novel fatty acid had a retention
time of 17.96 min in GC (Fig. 4). Mass peak m/z = 292
and fragmentation pattern of this fatty acid in GC-MS were
identical to those of the authentic GLA (Δ6,9,12-18:3)
standard; however, the fragmentation pattern was different from
that of the α-linolenic acid (Δ9,12,15-18:3) standard (data not
shown). Thus, the Ma524 cDNA expressed from pCGR5
encodes a Δ6-desaturase. The expressed enzyme converted
29.4% of the incorporated linoleic acid to GLA (Table 1).

Since there were no traces of α-linolenic acid (Δ9,12,15-18:3)
produced from the exogenous linoleic acid (Δ9,12-18:2)
in the recombinant yeast strains, it is suggested that the
enzyme produced by pCGR5 does not possess Δ15-desaturase
activity. In addition, when exogenous α-linolenic acid was
included in the growth medium, 3.9% of the incorporated



TABLE 1
Production of Linoleic Acid and GLA in Yeast Lipid Fraction

                   Total fatty
SC334 containing   acids(a) (µg)   Oleic (wt%)   Linoleic (wt%)   GLA(c) (wt%)
pYES2              440.1           23.2
pCGR5(b)           497.1           10.2          25.1             10.3
pCGR7              460.9           10.0          24.8
pCGR11/pCGR7       340.8           10.2          10.1             7.9
pCGR5/pCGR12       367.9           6.7           7.0              6.6

(a) The volume of culture used for lipid extraction was 100 mL.
(b) Exogenous linoleic acid (25 µM) was added.
(c) No α-linolenic acid was detected. GLA, γ-linolenic acid.

α-linolenic acid (Δ9,12,15-18:3) was converted to stearidonic
acid (Δ6,9,12,15-18:4). The identity of stearidonic acid was
verified by both GC and GC-MS (data not shown). This finding
further confirms the enzyme to be a Δ6-desaturase.

In the absence of exogenous linoleic acid, the lipid fraction
of the yeast strain expressing the Δ6-desaturase cDNA
produced two novel fatty acids (Fig. 5A). The first novel fatty
acid showed a mass peak m/z = 266, which is the expected
molecular ion of 16:2. Although the GC-MS fragmentation
patterns of this novel fatty acid and the authentic Δ9,12-16:2
were similar, they were different in intensity (Figs. 3B
and 5B) and in retention time (12.89 vs. 13.48 min) in GC
(Figs. 3A and 5A). Since this novel fatty acid was produced
in the presence of the Δ6-desaturase, it was most probably
Δ6,9-16:2. The second novel fatty acid produced in
SC334(pCGR5) had an identical retention time (16.95 min)
in GC (Fig. 5A), mass peak m/z = 294, and fragmentation
pattern in GC-MS to that of the Δ6,9-18:2 standard (data not
shown).

Production of GLA. As shown above, the recombinant
Δ12- and Δ6-desaturases were effective in converting their
substrates (endogenous oleic acid and exogenous linoleic
acid) to their respective products, linoleic acid (Fig. 3) in
SC334(pCGR7) and GLA (Fig. 4) in SC334(pCGR5). We
were interested in determining the feasibility of producing
GLA in a recombinant yeast strain in the absence of
exogenously added fatty acid substrates. The biosynthesis of GLA
from the endogenous oleic acid in S. cerevisiae would require
the simultaneous expression of Δ12- and Δ6-desaturases. In
order to allow co-expression of the Δ6- and Δ12-desaturase
cDNA, they were cloned under the control of the constitutive
TPI promoter into the leucine-selectable vector pYX242 to
create pCGR11 and pCGR12. Both combinations of the
promoters GAL1 and TPI were assayed for production of GLA. The
co-expression of pCGR11 (containing the Δ6-desaturase gene
under the control of TPI) and pCGR7 (containing the
Δ12-desaturase gene under the control of GAL1) resulted in ca. 7.9%
of GLA in total fatty acids of SC334(pCGR7, pCGR11)
(Table 1). The rates of conversion from oleic acid to linoleic
acid and from linoleic acid to GLA were ca. 50 and 44%,
respectively. In SC334(pCGR12, pCGR5), containing the
Δ6-desaturase gene behind GAL1 and the Δ12-desaturase gene
behind TPI, a level of 6.6% of GLA was found in the total
fatty acids (Table 1). In these recombinant yeast strains, the
FIG. 3. (A) Gas chromatographic analysis of fatty acid methyl esters (FAME) from the lipid fraction in yeast
containing pYES2 or pCGR7. Solid and open arrows indicate the fatty acids Δ9,12-16:2 and Δ9,12-18:2, respectively,
present in SC334(pCGR7) cultures. (B) Gas chromatography-mass spectrometry (GC-MS) analysis of the novel peak
(identified by the solid arrow in Fig. 3A) in yeast carrying pCGR7. The fragmentation pattern of the first novel peak
(top) was compared with that of the authentic Δ9,12-16:2 standard (bottom). pYES2 contained only vector, whereas
pCGR7 contained the coding region of the M. alpina Δ12-desaturase cDNA clone, pCGN5533. All yeast strains
were grown in the minimal medium. See Figure 1 for other abbreviations.



conversion rate for both oleic acid to linoleic acid and linoleic
acid to GLA was about 50%. Among them, SC334(pCGR11,
pCGR7) produced a higher level of GLA, and the GLA
accumulated predominantly in the phospholipid fraction (data not
shown). Hence, co-expression of M. alpina Δ6- and
Δ12-desaturase genes under the control of independent promoters in
yeast resulted in de novo synthesis of GLA.

Comparison of desaturase amino acid sequences. The
availability of three desaturase sequences from M. alpina was
used to examine the interspecies and interclass relationships
of these sequences. The amino-acid sequences between the
first and third His-boxes of representative desaturases were
used to construct a similarity dendrogram (Fig. 6). Two major
classes of desaturases can be discerned in this dendrogram.
One class contains the Δ12/n-6 and Δ15/n-3 desaturases,
while all known examples of Δ5- or Δ6-desaturases fall into a
separate class. Although only the central amino-acid core
sequence was used in the alignments, all the desaturase
sequences with an N-terminal cytochrome b5-like sequence
cluster into the latter sequence group. The presence of the
cytochrome b5 extension appears to be related to the
functionality of the desaturase and not the source of the gene; of the
three desaturases from M. alpina, only the Δ5- and
Δ6-desaturases have the fused cytochrome sequence. In addition, all
of the members of this class contain the H→Q substitution in
the third His box.
DISCUSSION

We utilized a random sequencing approach to identify cDNA
clones encoding fatty acid desaturases from the fungus M.
alpina. Partial sequence obtained from the 5'-end of
randomly selected clones was compared to the databanks, and
homologies to known acyl-lipid desaturases were noted. This
report describes the isolation of two different desaturase-like
cDNA clones encoding Δ6- and Δ12-desaturases. These
clones were identified in a first-phase sequencing of ~1200
cDNA. In addition to the two clones described in this work,
the first phase of sequencing also revealed clones corresponding
to the Δ5-desaturase originally obtained by heterologous
PCR (15,16) and clones homologous to the yeast stearoyl-CoA
desaturase (24) (data not shown). A more thorough
sequencing effort of 5400 additional cDNA resulted in the
identification of 13 sequences with homology to stearoyl-CoA
desaturases, 8 Δ6-desaturases, 9 Δ5-desaturases, and 5
Δ12-desaturases. It should be noted that several of the
random clones encoding Δ5- and Δ6-desaturases actually showed
cytochrome b5 matches in the BLAST results, due to the
highly homologous cytochrome domain at the N-terminus of
these desaturases. Had this domain not been previously
identified, several of these cDNA might not have been recognized
as desaturases in such a mass sequencing effort. This is an
important point to keep in mind when interpreting BLAST
results of all sequences; the presence of one highly conserved
domain may lead to mis-annotation of the sequence.

The comparison of the desaturase amino-acid sequences
shown in Figure 6 indicates that the M. alpina Δ5-desaturase
is more closely related to the cyanobacterial Δ6-desaturases
than to the plant and animal Δ6-desaturase sequences. The
ultimate significance of this is hard to evaluate, due to the lack
of other Δ5-desaturases for comparison. It should, however,
be noted that the C. elegans ORF on cosmid T13F2 (GenBank
accession number Z81122) that was proposed to be a possible
Δ5-desaturase (16) shows more similarity to the M. alpina
Δ6-desaturase sequence than to the M. alpina Δ5-desaturase
sequence (data not shown).

In the present study, we showed that the recombinant
enzyme expressed by an M. alpina desaturase-like gene (Ma648)
in pCGR7 converted Δ9-16:1 to Δ9,12-16:2 and oleic acid
(Δ9-18:1) to linoleic acid (Δ9,12-18:2) (Fig. 3A). These
findings clearly demonstrated that this gene encodes the
Δ12-desaturase. We also showed that the recombinant enzyme
expressed by another M. alpina gene (Ma524) in pCGR5
converted the n-6 fatty acid linoleic acid (Δ9,12-18:2) to GLA
(Δ6,9,12-18:3) (Fig. 4). When an n-3 fatty acid, α-linolenic
acid (Δ9,12,15-18:3), was used as the substrate,
SC334(pCGR5) produced the expected product, stearidonic
acid (Δ6,9,12,15-18:4) (data not shown). In the absence of
linoleic acid as substrate, this recombinant enzyme could
convert the endogenous Δ9-16:1 to Δ6,9-16:2, and oleic acid
(Δ9-18:1) to Δ6,9-18:2 (Fig. 5A). These findings demonstrate that
this gene encodes the Δ6-desaturase.

In order to evaluate the feasibility of producing GLA, a
high-value PUFA, in this microorganism, we co-expressed the
genes encoding Δ6- and Δ12-desaturases in yeast. When both
genes were present in a single construct in yeast, and
expressed from a single promoter, GAL1, none of these
transformed yeast strains produced a significant amount of GLA
(data not shown). Therefore, it seemed likely that two
independent promoters would be needed for the concurrent
expression of these two desaturases. Indeed, when these
desaturases were co-expressed in trans from two independent
promoters, GAL1 and TPI, the production of GLA reached as
high as 8% of the total lipids in yeast grown without
exogenous substrates (Table 1). By the action of two separate
promoters, these enzymes were able to effectively convert (ca.
50%) their respective substrates to products.

FIG. 5. (A) Gas chromatographic analysis of FAME from the lipid fraction in yeast containing pYES2 or pCGR5
grown without exogenous substrate. Solid and open arrows indicate novel fatty acids Δ6,9-16:2 and Δ6,9-18:2,
respectively. (B) GC-MS analysis of the novel peak (identified by the solid arrow in panel A) in yeast carrying pCGR5.
The fragmentation pattern of the first novel peak was compared with that of the authentic Δ6,9-16:2 standard. pYES2
contained only vector, whereas pCGR5 contained the M. alpina cDNA clone encoding the Δ6-desaturase. See Figures
1 and 3 for abbreviations.

In summary, we isolated two cDNA from M. alpina encoding
the Δ6- and Δ12-desaturases using a random
sequencing-based strategy. The identities of the two cDNA were
confirmed by functional expression and analysis in a widely used
microorganism, baker's yeast. By introducing the two
required desaturases (Δ6- and Δ12-) under the control of
independent promoters in yeast, we developed a novel approach
to synthesize GLA.
cells, although the number of endogenous XMyoD transcripts
is much less than that normally found in the myotomes
(Table 1).

Do animal caps isolated from XMyoD-injected embryos show
other signs of muscle differentiation? The XMyoD-injected
animal caps are histologically indistinguishable from uninjected
or XAfyoDi/^P-injected controls (Fig. 3a, c, d). By contrast,
uninjected animal caps induced by vegetal tissue elongate and
contain large blocks of muscle (Fig. 3b). We obtained intense
labelling of this muscle tissue using the 12/101 anti-muscle
antibody, but saw no labelling above background of the
XMyoD-injected animal cap cells (Fig. 3). We conclude that no
differentiated muscle is formed by XMyoD-injected animal caps.
Thus, animal cap cells that contain as much cardiac actin RNA
as normal myotomal cells do not express the full myogenic
programme, but continue to differentiate as epidermis. Whole
XMyoD-injected embryos also develop relatively normally,
becoming tadpoles with substantially normal external and
internal structures, including a variety of differentiated cell types
(data not shown).

We have shown that XMyoD can activate a muscle gene to
its normal level in animal cap cells. MyoD can bind to sites in
the promoters of muscle genes, and we note that, in addition
to a CArG sequence that is essential for transcription, there
are potential XMyoD-binding sites located further upstream
in the Xenopus cardiac actin promoter (M. V. Taylor, N.D.H.
and T. J. Mohun, unpublished data). The lack of muscle
differentiation in XMyoD-injected animal caps could be caused
by a failure to maintain a sufficiently high concentration of
XMyoD protein. Alternatively, as enough XMyoD has been
supplied to activate the cardiac actin gene to its normal level,
it may be that muscle did not differentiate because other
myogenic factors, not themselves activated by XMyoD, are
required to divert these embryonic cells from their normal
pathway of differentiation.
Enhancement of chilling tolerance
of a cyanobacterium by genetic
manipulation of fatty
acid desaturation

Hajime Wada, Zoltan Gombos & Norio Murata

National Institute for Basic Biology, Myodaiji, Okazaki 444, Japan

The sensitivity (or tolerance) of plants to chilling determines their
choice of natural habitat and also limits the worldwide production
of crops. Although the molecular mechanism for chilling sensitivity
has long been debated, no definitive conclusion has so far been
reached about its nature. A probable hypothesis, however, is
that chilling injury is initiated by phase transition of
cellular membranes, as demonstrated for cyanobacteria, which
serve as a model system for the plant cells. Because the phase
transition temperature depends on the degree of unsaturation of
fatty acids of the membrane lipids, it is expected that the chilling
tolerance of plants can be altered by genetically manipulating
fatty-acid desaturation by introducing double bonds into fatty
acids of membrane lipids. Here we report the cloning of a gene
for the plant-type desaturation (termed desA). The introduction
of this gene from a chilling-resistant cyanobacterium,
Synechocystis PCC6803, into a chilling-sensitive cyanobacterium,
Anacystis nidulans, increases the tolerance of the recipient to low
temperature.

A mutant in fatty-acid desaturation of membrane lipids of
the transformable cyanobacterium, Synechocystis PCC6803, was
isolated as described previously. This mutant, termed Fad12,
is defective in the activity of desaturation that introduces a
second double bond at the Δ12 position of the C18 fatty acids
of membrane lipids. It grows much slower at low temperatures
(such as 22 °C) than the wild type.



TABLE 1 Composition of major fatty acids of total membrane lipids in
various strains of Synechocystis PCC6803

Fatty acid (mol %)

Strain
Wild type
Mutant (Fad12)
Transformant of Fad12 with desA (Bluescript/1.5 kbp)
Transformant of Fad12 with desA (pTZ19R/8 kbp)
Transformant of wild type with desA::Km^r (Bluescript/1.5 kbp::Km^r)
t, trace amount (<0.4%). Wild type, mutant (Fad12), and transformants
of Fad12 with desA were grown photoautotrophically at 34 °C as described
previously. Transformant of wild type with desA::Km^r was cultivated in
the same way but in the presence of 5 µg ml^-1 kanamycin in the culture
medium. The disrupted gene, desA::Km^r, was constructed by interrupting
the desA in the cloned Bluescript/1.5 kbp at the HindIII site by the
aminoglycoside 3'-phosphotransferase gene (the kanamycin-resistance (Km^r)
cartridge) originating from the bacterial transposon Tn5 (ref. 16). Fatty acids
of the total membrane lipids were analysed according to Sato and Murata.
The values are the means obtained in three independent experiments, and
the deviation of values was within ±1.0%.
[FIG. 1a: nucleotide sequence and deduced amino-acid sequence of desA]



To clone a gene required for the desaturation at the Δ12
position, a genomic library of Synechocystis PCC6803 was
constructed in plasmid pTZ19R. The genomic library was screened
for clones capable of complementing Fad12 in the growth at
low temperature and the desaturation at the Δ12 position according
to the transformation method developed by Dzelzkalns
and Bogorad. A plasmid clone with an 8-kilobase pair (kbp)
insert, termed pTZ19R/8 kbp, was isolated (Table 1). The
homologous recombinations as described by Williams between
pTZ19R/8 kbp and the mutated gene of the chromosome of
Fad12 may have taken place. The plasmid, pTZ19R/8 kbp, was
digested with AvaI to obtain a 1.5-kbp fragment, which could
also complement Fad12. The 1.5-kbp fragment was subcloned
into a Bluescript plasmid (termed Bluescript/1.5 kbp), and its
nucleotide sequence was determined. In only one of the six
possible reading frames was there an open-reading frame. This
open-reading frame [...] corresponded to 351 amino-acid residues
[...]. The 1.5-kbp fragment also contained a 5' upstream
region of 166 bp and a 3' downstream region of 270 bp. This
gene (designated desA) encodes either a plant-type desaturase,
which can introduce the second cis double bond at the Δ12
position of fatty acids [...] membrane glycerolipids, or a
[...] desaturase (see below). The hydropathy profile
of the deduced amino-acid sequence of the desA product (Fig.

NATURE · VOL 347 · 13 SEPTEMBER 1990
FIG. 1 a, Nucleotide sequence and deduced amino-acid sequence of desA,
a gene for fatty-acid desaturation at the Δ12 position of fatty acids in
Synechocystis PCC6803. The deduced amino-acid sequence is numbered
with 1 for the first methionine. b, Hydropathy profiles of the deduced
amino-acid sequences of the desA product and its putative
membrane-spanning regions indicated by solid bars (top graph), and the stearoyl-CoA
desaturase from rat liver (bottom graph).

METHODS. Nucleotide sequence was determined by the dideoxy
chain-termination method using double-stranded DNA templates. The
unidirectional deletion of the plasmid was performed according to the instructions
of the manufacturer of the Bluescript DNA sequencing system (Stratagene
Cloning Systems). Hydropathic index was calculated according to the
algorithm of Kyte and Doolittle for a window size of 19 amino-acid residues.
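
The hydropathy calculation named in the legend can be sketched as a
sliding-window average over the Kyte and Doolittle scale. This is a
minimal illustration under stated assumptions: the function name is
ours, each window is scored by the mean of its residue values, and the
input is a made-up peptide rather than the desA product.

```python
# Kyte and Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Mean hydropathy of every full window along the sequence.

    Returns an empty list when the sequence is shorter than the window.
    """
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    if window < 1 or window > len(scores):
        return []
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


# Made-up peptide: a hydrophobic stretch flanked by charged residues.
peptide = "KKDE" + "LIVAL" * 4 + "RRDE"
profile = hydropathy_profile(peptide)
print(len(profile), round(max(profile), 2))
```

Stretches where the windowed value stays strongly positive are the
kind of region a profile like Fig. 1b marks as putatively
membrane-spanning.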



1b) is similar to that of the stearoyl-CoA desaturase from rat
liver. The desA gene product has two clusters of hydrophobic
regions which are putative membrane-spanning domains (Fig.
1b). However, the sequence similarity between the desA product and
stearoyl-CoA desaturase from rat liver is <30% at the nucleotide
level and <10% at the amino-acid level.

We transformed the Synechocystis mutant, Fad12, with desA
included in Bluescript/1.5 kbp and pTZ19R/8 kbp to examine
whether the desA product is responsible for the fatty-acid
desaturation. Table 1 shows that the wild type and the
transformants contained high levels of 18:1(9), 18:2(9,12) and
18:3(6,9,12) fatty acids (fatty acids are represented by numbers
of carbon atoms and double bonds, before and after a colon,
respectively, and the positions of double bonds, counted from
the carboxy terminus (Δ), are indicated by numbers in
parentheses). In Fad12, 18:1(9) and 18:2(6,9) significantly
increased, whereas 18:2(9,12) and 18:3(6,9,12) decreased to
trace amounts. It is noteworthy that Fad12 lacked the fatty acids
having the double bond at the Δ12 position, and that this double
bond was recovered by transformation with desA. Similar
changes in the desaturation of fatty acids at the Δ12 position
were observed in all lipid classes, monogalactosyl diacylglycerol,
digalactosyl diacylglycerol, sulphoquinovosyl diacylglycerol
and phosphatidylglycerol (data not shown). To examine whether
the desA gene product is necessary for fatty-acid desaturation
[...], we transformed the wild-type Synechocystis PCC6803 according
to [...] with a disrupted desA, which was produced by
insertion of a kanamycin-resistance cartridge (Km^r). The
transformant of wild type with desA::Km^r had the same fatty-acid
composition as that of Fad12 (Table 1).
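
The fatty-acid shorthand defined in this paragraph (carbon number,
colon, number of double bonds, then Δ positions in parentheses) is easy
to handle programmatically. The parser below is a hypothetical sketch
for illustration; the function names and the validation rule (listed
positions must match the stated number of double bonds) are our own
assumptions.

```python
import re

SHORTHAND = re.compile(r"^\s*(\d+)\s*:\s*(\d+)\s*(?:\(([\d\s,]*)\))?\s*$")


def parse_fatty_acid(label):
    """Parse e.g. '18:3(6,9,12)' into (carbons, double_bonds, positions)."""
    match = SHORTHAND.match(label)
    if match is None:
        raise ValueError(f"not a fatty-acid shorthand: {label!r}")
    carbons, double_bonds = int(match.group(1)), int(match.group(2))
    listed = match.group(3) or ""
    positions = tuple(int(p) for p in listed.split(",") if p.strip())
    if positions and len(positions) != double_bonds:
        raise ValueError(f"{label!r}: positions do not match double-bond count")
    return carbons, double_bonds, positions


def has_delta12(label):
    """True if the shorthand lists a double bond at the Δ12 position."""
    return 12 in parse_fatty_acid(label)[2]


for name in ["18:1(9)", "18:2(6,9)", "18:2(9,12)", "18:3(6,9,12)"]:
    print(name, has_delta12(name))
```

Applied to the four C18 species discussed above, the check separates
the two that carry the Δ12 double bond from the two that accumulate in
Fad12.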

Another transformable cyanobacterium, Anacystis nidulans
R2-SPc, was transformed with desA according to Kuhlemeier
and van Arkel. It is noteworthy that A. nidulans is a member
of the group of cyanobacteria which are completely defective
in desaturation at the Δ12 position. Bluescript/1.5 kbp was
digested with SacI to obtain a fragment containing the total
[...]. The fragment was subcloned
into pUC303 (ref. 10), a shuttle vector between A. nidulans and
Escherichia coli, at the SacI site of the streptomycin-resistance
gene but in the opposite direction (termed pUC303/desA). The
wild-type and the transformant with pUC303 alone (control)
[...] the principal fatty acids [...] this cyanobacterium can
introduce [...] (Table 2). In the transformant with pUC303/desA,
fatty acids having two double bonds, 16:2(9,12) and 18:2(9,12),
emerged to significant levels at the expense of 16:1(9) and 18:1(9).
The transformation of A. nidulans R2-SPc with the desA disrupted
by the Km^r cartridge was also carried out as above. The
transformant with pUC303/desA::Km^r had a fatty-acid composition
[...] the wild type and the transformant with
pUC303. The lipid class composition and the lipid-to-protein
[...] by transformation with pUC303 and
[...]. These [...] demonstrate that the
transformant with desA has acquired the desaturase activity in
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Composition of major fatty adds of total membrane liDidr~ 
various strains of Anacystis nidulans R2-SPc

                                          Fatty acid (mol %)
Strain                                16:0   16:1(9)   16:2(9,12)   [...]   18:1(9)   18:2(9,12)
Wild type                              51      36          0          3        6          0
Transformant with pUC303               51      37          0          5        5          0
Transformant with pUC303/desA          47      29          5          5        2          6
Transformant with pUC303/desA::Km^r    50      33          0        [...]      9          0



Wild type was grown photoautotrophically at 34 °C as described [...].
[...] gave the same result [...] transformants with [...]

introducing the second double bond at the Δ12 position of fatty
acids, and that the 166-bp upstream sequence contains the
promoter region of this gene.

A. nidulans is sensitive to chilling temperature. At growth
temperature, both plasma and thylakoid membranes are in the
liquid-crystalline state. With decrease in growth temperature,
the thylakoid membrane first goes into the [...]
only with reversible deterioration of [...]
decrease in temperature, the plasma membrane [...] phase-separated
state, in which leakage of the [...]
relative molecular mass into the medium [...] irreversibly damages
physiological activities.

Fig. 2 [...] on photosynthetic activity and phase
transition of membrane lipids in the transformants with and without desA.
a, [...] exposure to 5 °C on the photosynthetic [...] corresponded to about
[...] transformants. Each point [...] independent experiments.
b, [...] temperature for 20 min on the absorbance change
[...] spectrum caused by exposure to [...].
Independently obtained transformants gave essentially the same result.
[...] appropriate time the cells were rewarmed to 34 °C and
[...] spectrum were measured according to Ono and Murata.

[...] wild type and the transformant with pUC303 of A.
nidulans R2-SPc grown at 34 °C [...]
the transformant with pUC303/desA [...]
(Fig. 2a). On exposure to 2 °C for [...] photosynthetic
activity was decreased to [...] in the transformant with
pUC303 and to 75% in the transformant with [...].
These observations demonstrate [...]
nidulans can be studied by [...] spectrum
of carotenoids. The [...] membranes
of the transformant [...]
ranges 8-4 °C with a midpoint at [...]
regarded as [...]

[...] cyanobacterium [...] manipulation of
fatty-acid desaturation. Because [...] mechanism could
operate in the [...], it might be possible
to improve their chilling [...] manipulation [...]
fatty-acid desaturation.
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Partition of tRNA synthetases 
into two classes based on 
mutually exclusive sets
of sequence motifs



[...] referred to as class I
synthetases [...] ArgRS, ValRS [...]
sequences ('HIGH' and 'KMSKS') [...] consensus
[...] of Escherichia coli [...] sequence
[...] molecular mass
[...] ThrRS and SerRS [...]
of PheRS. These [...]
a search through the entire [...]
be strongly correlated [...]
occurring either on the 2' OH
[...] last nucleotide
[...] transform and complement the
[...] a temperature-sensitive
[...] activity. From complementing trans-

NATURE · VOL 347 · 13 SEPTEMBER 1990



[...] digestion, ligated into plasmid pUC18 vector and [...].
The resultant bacterial [...]
ProRS activity [...]-kilobase (kb) DNA
fragment was subcloned into M13 [...]
exonuclease and sequenced [...] T7 DNA polymerase.
The 2,795-bp [...] contain
an open reading frame [...]
relative molecular mass [...]
that estimated by SDS-PAGE [...]
protein sequence deduced [...]
primary structure was independently confirmed [...]
residues of the [...] messenger
RNA 5' termini were determined [...]
of the DNA region [...] symmetry, centred
on position +1,739 and followed by [...] T residues.
This structural feature [...] a rho-independent
termination signal [...]
Genetics Computer [...] program
using the algorithm of Brendel and Trifonov, which predicted
a stop site for [...] at position +1,755 shown in
Fig. 1. [...]
with other aaRS sequences [...] as defined
[...] of ProRS and [...] SerRS [...]
of conservative substitutions [...] proteins
is presented [...] longer
C-terminal domain [...] whereas ThrRS has a long
extension [...]
translation, a unique [...] ThrRS
[...] this mechanism [...]
ThrRS and a tRNA [...] located upstream
of the AUG initiation [...] recently, however,
several results pointed to [...] at the level of thrS
operon, of a further domain interacting with ThrRS and whose
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Strains of Saccharomyces cerevisiae bearing the 
ole1 mutation are defective in unsaturated fatty acid
(UFA) synthesis and require UFAs for growth. A previously
isolated yeast genomic fragment complementing
the ole1 mutation has been sequenced and determined
to encode the Δ9 fatty acid desaturase enzyme
by comparison of primary amino acid sequence to the
rat liver stearoyl-CoA desaturase. The OLE1 structural
gene encodes a protein of 510 amino acids (251
hydrophobic) having an approximate molecular mass
of 57.4 kDa. A 257-amino acid internal region of the
yeast open reading frame aligns with and shows 36%
identity and 60% similarity to the rat liver stearoyl-CoA
desaturase protein. This comparison disclosed
three short regions of high consecutive amino acid
identity (>70%) including one 11 of 12 perfect residue
match. The predicted yeast enzyme contains at least
four potential membrane-spanning regions and several
shorter hydrophobic regions that align exactly with
similar sequences in the rat liver protein. An ole1 gene-disrupted
yeast strain was transformed with a yeast-rat
chimeric gene consisting of the promoter region
and N-terminal 27 codons of OLE1 fused to the rat
desaturase coding sequence. Fusion gene transformants
displayed near equivalent growth rates and modest
lipid composition changes relative to wild type
yeast control, implying a significant conservation of Δ9
desaturase tertiary structure and efficient interaction
between the rat desaturase and yeast cytochrome b5.



In animal and fungal cells, monounsaturated fatty acids are
synthesized via an aerobic process from saturated fatty acid
precursors by a microsomal membrane-bound three-component
enzyme system involving cytochrome b5, NADH-dependent
cytochrome b5 reductase, and the Δ9 fatty acid desaturase
(1-3). This complex catalyzes the insertion of a double bond
between carbons 9 and 10 of the saturated fatty acyl substrates,
palmitoyl (16:0)- and stearoyl (18:0)-CoA, yielding the
monoenoic products palmitoleic (16:1) or oleic (18:1) acids.
Although higher eukaryotes contain polyunsaturated fatty
acids in their membranes, either synthesized endogenously
via Δ12 and Δ15 desaturase reactions or obtained from their
diet, the Δ9 reaction accounts for all de novo unsaturated
fatty acid (UFA)¹ production in Saccharomyces cerevisiae (4).

Isolation and characterization of fatty acid desaturase enzymes
has proved difficult due to their extraordinary hydrophobic
nature and tight association with membranes. Although
fatty acid desaturation was first described using the
yeast Δ9 desaturase system, only animal Δ9 enzymes have
been successfully purified to homogeneity (5, 6). At a genetic
level, only the DNA sequences of the rat liver and mouse
adipocyte genes have been reported and analyzed (7, 8). Those
genes were found to encode proteins with 92% identical amino
acid sequences.

The Δ9 desaturase from rat liver has been most extensively
characterized. It is a protein consisting of 358 amino acids of
which 62% are hydrophobic (7). The functional enzyme has
an obligate phospholipid requirement and contains one molecule
of non-heme iron (5). Effects of chemical modification
on enzyme function have suggested that arginyl and tyrosyl
residues are involved in the binding of the negatively charged
CoA moiety of the substrate and in the chelation of the iron
prosthetic group, respectively (9). A truncated rat liver Δ9
enzyme missing 26 residues from the N terminus is also
membrane-bound and functional (10).

Yeast mutants bearing the ole1 allele require oleic acid for
growth and were believed to produce a defective Δ9 desaturase,
suggesting that OLE1 was the structural gene encoding the
enzyme (11). Recently, we isolated and characterized a yeast
genomic fragment containing the OLE1 gene of S. cerevisiae
(12). Replacement of the wild type gene in haploid cells with
a disrupted form of that fragment resulted in a UFA-requiring,
nonreverting phenotype.

In this paper we report the DNA sequence of the S. cerevisiae
OLE1 gene and compare the deduced amino acid sequence
of the yeast Δ9 fatty acid desaturase with that of the
rat liver stearoyl-CoA desaturase. Although
the proteins encoded are highly divergent, the rat Δ9 desaturase
gene functions efficiently in S. cerevisiae in place of the
native yeast gene. Furthermore, predicted structural features
of the two proteins suggest a model for the topology of the Δ9
fatty acid desaturase in the ER membrane.



¹ The abbreviations used are: UFA, unsaturated fatty acid; ORF,
open reading frame; ER, endoplasmic reticulum; kb, kilobase(s). 
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MATERIALS AND METHODS 
DNA Manipulations, Media, and Strain—All recombinant DNA
manipulations were according to standard methods (13, 43). Plasmid
amplifications and bacterial transformations were performed using
either Escherichia coli strain HB101 or XL1-Blue (Stratagene). Yeast
transformations were by the method of Ito et al. (14). Growth analysis
was performed in synthetic dextrose medium supplemented with the
appropriate amino acids (23). The genotype of yeast strain L8-14C
is: a, ole1Δ::LEU2, leu2-3, leu2-112, ura3-52, his4.

DNA Sequencing—Overlapping DNA fragments lying within the
OLE1 open reading frame were subcloned into pBluescript vectors
(Stratagene) in two orientations for sequencing in either direction.
Single-stranded DNA sequencing templates were prepared by methods
supplied by Stratagene. The M13(-20) primer was hybridized to
single-stranded DNA templates and DNA sequencing was performed
by the dideoxy chain termination method of Sanger et al. (15) using
the modified T7 DNA polymerase, Sequenase (U. S. Biochemical
Corp.). In two cases, OLE1 internal oligonucleotides were synthesized
to facilitate DNA sequence analysis.

DNA Sequence and Deduced Primary Sequence Analysis—
DNA sequence and deduced primary sequence analyses were performed
using the Genetics Computer Group (GCG) sequence analysis
software package (16). The amino acid sequence of the rat liver
stearoyl-CoA desaturase was obtained from GenBank. Primary sequence
comparison of the yeast and rat liver Δ9 desaturases was performed
using the BestFit analysis program. Hydropathy analysis was according
to Kyte and Doolittle (17).
Construction of Modified ole1 Alleles—Alleles ole1-33 and ole1-107,
containing stop codons in the 5' region of the coding sequence, were
constructed similarly. YEp352/OLE4.8 was partially digested with
SalI or NcoI, the cohesive ends were made blunt, and plasmids were
religated. Following amplification in E. coli, plasmid samples were
subjected to restriction enzyme analysis. Candidates lacking the relevant
restriction site were subjected to DNA sequence analysis for
verification.

Construction of Episomal and Centromeric Plasmids Bearing the
Rat Liver Stearoyl-CoA Desaturase Gene—A 1.2-kb rat liver Δ9 desaturase
cDNA fragment encoding residues 3-358, stop codon, and
136 base pairs of the 3'-untranslated region was removed from
plasmid pDs3-358 (10) by digestion with BamHI and [...] and inserted
into the multiple cloning site of episomal plasmid YEp352. A
1.0-kb yeast genomic fragment encompassing the promoter region,
translation initiation codon, and the first 27 codons of OLE1 was
isolated via HindIII/SalI digestion and ligated in-frame with the rat
desaturase fragment in YEp352. In this final construct, an eight-codon
linker derived from the multiple cloning site regions of pUC8
and YEp352 separates the yeast N-terminal codons from the rat
desaturase sequence. The predicted size of the fusion gene product is
391 amino acid residues. The yeast-rat fusion gene was then recovered
via HindIII/DraI digestion and ligated into YCp50 using HindIII and
NruI restriction sites. Plasmids bearing the fusion gene were amplified
in E. coli strain XL1-Blue and used to transform the yeast ole1
gene-disrupted strain L8-14C.
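
The predicted size of the fusion product follows directly from the segment lengths given above. The Python sketch below only redoes that arithmetic; the helper name `fusion_length` and its argument names are hypothetical and are not part of the original methods.

```python
# Illustrative check of the fusion-protein length quoted in the text.
# The helper name and argument names are hypothetical.

def fusion_length(yeast_codons, linker_codons, rat_first, rat_last):
    """Residues in a fusion of yeast N-terminal codons, a linker,
    and rat residues rat_first..rat_last (inclusive)."""
    rat_codons = rat_last - rat_first + 1
    return yeast_codons + linker_codons + rat_codons


if __name__ == "__main__":
    # 27 yeast codons + 8 linker codons + rat residues 3-358 (356 codons)
    print(fusion_length(27, 8, 3, 358))
```

With the values stated in the text (27 yeast codons, an eight-codon linker, rat residues 3-358) the sum is 391, in agreement with the predicted size given above.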

Lipid Isolation and Fatty Acid Analysis—Lipids were extracted
from whole yeast cells by direct saponification (18). Fatty acid methyl
esters were prepared by transmethylation with boron trifluoride (19)
and analyzed by gas chromatography using a 30-meter capillary
column SP-2330 (Supelco) in a Hewlett-Packard 5710A chromatograph
as previously reported (12).

RESULTS AND DISCUSSION 

General Features of the OLE1 Structural Gene—In a previous
report it was shown that a cloned 4.8-kb HindIII yeast
genomic fragment, but not two subclones of that fragment
terminating at a central KpnI region, complemented the ole1
mutation of S. cerevisiae (12). From that KpnI junction,
overlapping subclones were used to "walk" through the observed
open reading frame (ORF) in both directions, yielding
the sequence strategy presented in Fig. 1. Both strands were
sequenced through the entire ORF without ambiguity.

The DNA and deduced amino acid sequences of OLE1 and
flanking nucleotide sequence are shown in Fig. 2. The ORF is
1530 nucleotides long. Translation of the entire ORF would
produce a 510-amino acid polypeptide having an approximate
molecular mass of 57.4 kDa containing 49.2% hydrophobic
and 25.7% charged (10.0% acidic and 15.7% basic)
amino acid residues. No consensus N-glycosylation sites are
present in the deduced amino acid sequence of OLE1 and the
protein does not appear to contain a cleavable N-terminal
signal sequence.
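
The figures in this paragraph can be cross-checked against one another with simple arithmetic. The following Python sketch is illustrative only: the constant and function names are hypothetical, and the mean residue mass is merely a derived plausibility check, not a value reported in the paper.

```python
# Illustrative arithmetic on figures quoted in the text; names are hypothetical.

ORF_NT = 1530        # ORF length in nucleotides
MASS_DA = 57_400     # approximate molecular mass (57.4 kDa)


def codon_count(orf_nt):
    """Number of codons in an ORF; the length must be a multiple of three."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf_nt // 3


residues = codon_count(ORF_NT)            # amino acids encoded
hydrophobic = round(residues * 0.492)     # 49.2% hydrophobic residues
charged_pct = round(10.0 + 15.7, 1)       # acidic + basic, in percent
mean_residue_mass = MASS_DA / residues    # average Da per residue

if __name__ == "__main__":
    print(residues, hydrophobic, charged_pct, round(mean_residue_mass, 1))
```

The sketch reproduces the 510 residues and 251 hydrophobic residues quoted in the abstract, and the acidic and basic fractions sum to the stated 25.7% charged residues.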

Yeast TATA promoter elements are commonly found 40-120
base pairs upstream from transcription initiation sites
(20) with an average mRNA leader sequence of 52 nucleotides
(21). The OLE1 promoter region has two consensus TATA
promoter elements (TATAAA and TATATA) located at positions
-30 and -156 relative to the ORF. A transcription
initiation event, directed from the TATATA element located
at -156, could yield a transcript having features consistent
with the above observations. However, transcription initiation
directed from the TATAAA promoter element located at -30
could result in an atypically short, nontranslated leader sequence
relative to the first in-frame ATG. Furthermore, there
are three additional in-frame ATG codons within the first
400 base pairs of the OLE1 ORF at positions 56, 61, and 116
that could also serve as potential translation start sites (see
Fig. 1). Due to the close proximity of the first ATG codon to
the TATA promoter element at -30 and comparison with the
rat desaturase (discussed below) that showed no significant
similarities in the first 140 amino acids, we were prompted to
test for functional OLE1 products initiating from these downstream
sites. Two modified ole1 alleles were constructed (see
"Materials and Methods") that shifted the ORF and introduced
translation stop codons at either position 33 (ole1-33)
or 107 (ole1-107). Both in-frame stop codons were positioned
before the next available ATG sequence. The ole1 gene-disrupted
yeast strain L8-14C, bearing the deletion allele
ole1Δ::LEU2, was transformed with either of the above alleles
on an episomal plasmid and tested for the ability to grow in
the absence of exogenous UFAs. (Strains bearing this ole1
allele were previously shown (12) to completely lack Δ9 desaturase
activity as determined by product formation and
have limited and finite growth potential (4-5 generations) in
UFA-free medium.) In both cases the transformed strain grew
only when UFAs were added to the growth medium, which is
consistent with the first in-frame ATG codon functioning as
the primary site of translation initiation.
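
The logic of the stop-codon alleles can be sketched in a few lines of Python. This is a hedged illustration, not the authors' procedure: the function names and the toy sequence are hypothetical, and only the codon positions (ATGs at 1, 56, 61 and 116; stops at 33 and 107) are taken from the text.

```python
# Illustrative sketch: locate in-frame ATG and stop codons in an ORF.
# Function names and the toy sequence are hypothetical.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(orf):
    """Split an ORF into complete codons, reading from its first base."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    return [orf[i:i + 3] for i in range(0, usable, 3)]


def in_frame_atgs(orf):
    """1-based codon positions of every in-frame ATG."""
    return [n for n, c in enumerate(codons(orf), start=1) if c == "ATG"]


def first_stop(orf):
    """1-based codon position of the first in-frame stop, or None."""
    for n, c in enumerate(codons(orf), start=1):
        if c in STOP_CODONS:
            return n
    return None


def atgs_downstream_of(stop_pos, atg_positions):
    """In-frame ATGs that lie beyond an introduced stop codon."""
    return [p for p in atg_positions if p > stop_pos]


if __name__ == "__main__":
    toy = "ATGGCTTAAGGTATG"  # codons: ATG GCT TAA GGT ATG
    print(in_frame_atgs(toy), first_stop(toy))
    # Codon positions quoted in the text.
    for stop in (33, 107):
        print(stop, atgs_downstream_of(stop, [1, 56, 61, 116]))
```

Each introduced stop leaves at least one downstream in-frame ATG available, so the failure of both alleles to support UFA-independent growth argues that initiation at those downstream sites does not yield a functional product.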

Yeast and Rat Liver Δ9 Enzyme Amino Acid Analysis—A
computer search of homologies to all current entries in
GenBank/EMBL protein data bases identified a single data
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Fig 1 OLEl restriction map and sequencing strategy . Pri- 
mary restriction sites mapping the 4.8-kb yeast DNA fragment containing
OLE1: Bg, BglII; Bt, BstEII; H, HindIII; Hp, HpaI; K, KpnI;
R, EcoRI; S, SalI; Sm, SmaI; P, PstI; and X, XhoI. The position and
direction of the 1530-base-long ORF encoding the Δ9 enzyme is
indicated by the large arrow above the map. Small arrows below the
map indicate by size and direction the OLE1 subclones used to
sequence the entire ORF and flanking regions.
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1 ACACTCAACAAACCTTATCTAGTGCCCAACCAGGTGTGCTTCTACGAGTCTTGCTCACTC 60 

61 AGACACACCTATCCCTATTCTTACGGCTATGGGGATGGCACACAAAGGTGGAAATAATAG 120 

121 TAGTTAACAATATATGCAGCAAATCATCGGCrCCTGGCTCATCGAGTCTTGCAAATCAGC 180 

181 ATATACAlMAlAiaTGGGGGCAGATCTTGATTCATTTATTGTTCTATTTCCATCTTTCC 240 

241 TACTTCTGTTTCCGTTTATATTTTGTATTACGTAGAATAGAACATCATAGiTAATAGATAG 300 

301 TTGTGGTGATCATATJaiaA&CAGCACTAAAACATTACAACAAAGATGCCAACTTCTGGA 360 

MetProThrSerGly 

361 ACTACTATTCAATTGATTGACGACCAATTTCCAAAGGATGACTCTGCCAGCAGTGGCATT 420 
ThrThrlleGluLeuXleAspAspGlnPheProLysAspAspSerAlaSerSerGlylle 

421 GTCGACGAAGTCGACTTAACGGAAGCTAATATTTTGGCTACTGGTTTGAATAAGAAAGCA 480 
ValAspGluValAspLeuThrGluAlaAsnlleLeuAlaThrGlyLeuAsnLysLysAla 

4 81 CCAAGAATT6TCAACG6TTTTG6TTCTTTAATGGGCTCCAAGGAAATGGTTTCCGTGGAA 540 
ProArqIleValAsnGlyPheGIySerLeuMetGlvSerLv3Gl uMet:V alsgrValci» 

541 TTCGACAAGAAGGGAAACGAAAAGAAGTCCAATTTGGATCGTCTGCTAGAAAAGGACAAC 600 
. PheAspLysLys61yAsn61uX.ysLysSerAsnl«tiAspArgLeuLeuGluLysAspAsn 

601 CAAGAAAAAGAAGAAOCTAAAACTAAAArkACATCTCCGAACAACCATGGACTTTGAAT 660 
GlnGluLysGluGluAlaLysThrLysIleHisIleSexGluGlnPTQTrpThrLeuAsn 



1141 GTCATTCCAACTCTTATCTGTGGTTACTTTTTCAACGACTATATGGGTGGTTTGATCTAT l^nn 
VallleProThrLeuIleCysGlyTyrPhePheAsnAspTyrMetGlyGlyLeulleTyr 



. 1201 GCCGGTTTTATTCCTGTCTTTGTCATTCAACAAGCTACCTTTTGCATTAACTCCATGGCT l>tn 
AlaGlyPhelleArqVaiPheVallleGlnGlnAlaThrPhecyslleAsnSerMetAla 



661 AACTGGCACCAACATTTGAACTGGTTGAACATGGTTCTTGTTTGTGGTATGCCAATGATT 720 
As nT rpHl sGlnHi sLeuAsnTrpLe uAsnHciV a 1 LcuValCy sGlyMetPr oMet I le 



721 GGTTGGTACTTCGCTCTCTCTGGTAAAGTACCTTTGCATTTAAACGTTTTCCTTTTCTCC 780 
GlyTrpTyrPheAlaLeuSerGlyLysValProlAuHiaLeuAsnValPheLeuPheSer 



781 GTTTTCTACTACGCT<n'CGGTGGTGTrrCTATTACTGCCGGTTACCATACATTATOGTCT 840 
ValPheTyrTyrAlaValGlyGlyValSerlleThrAlaGIyTyrHlsArgLeuTrpSer 

841 CACAGATCTTACTCCGCTCACTCGCCATTGAGATTATTCTACGCTATCTTCGGTTGTGCT 900 
HisArgSeTTyrSerAlaHlsTrptProLeuArgl^uPheTyrAlallePheGlyCyaAla 

901 TCCGTTGAJtf;GGTXXOCTAAATGGTGGGGCC»CrrCTCAC»GAATTCAC»^^ 960 
S'srValGluGlySerAlal.ysTrFfTrpGly Hi sSer HisAr gl leHi sHl sAr gTy ^hr 

961 3ATACCTTGJU3AGATCCTTATGACGCTCGTAGAGGTCTATGGTACTCCCACATGGGATGG 1020 
AspThrLeuArgAspProTyrAspAlaAxgArgGlyLeuTrpTyrSerHisMetGlyTrp 

1021 ATGCTTTTGAAGCCAAATCCAAAATACAAGGCTAGAGCTGATATTACCGATATGACTGAT 1080 
Met LeuLeuLysProAsnProLy sTyr Lys AlaArgAlaAspIleThrAspHetnirAsp 



1261 CATTACATCGGTACCCAACCATTCGATGACAGAAGAACCCCTCGTGACAACTGGATTACT 1320 
Hi sTy r IleGlyThrGlnP roPheAspAspAr gArgThrPr oArgAspAanT rpi leTh r 

1321 GCCATTGTTACTTTCGGTGAAGGTTACCATAACTTCCACCACGAATTCCCAACTGATTAC 13 SO 
AlalleValThrPheGlyGluGlyTyrHisAsnPheHisHisGluPheProThrAspTyr 

1381 AGAAACGCTATTAAGTGGTACCAATACGACCCAACTAAGGTTATCATCTATTTGACTTCT 1440 
ArgAsnAlalleLysTrpTyrGlnTyrAspProThrLysValllelleTyrLeuThrSer 

1441 TTAGTTGGTCTAGCATACCACTTGAAGAAATTCTCTCAAAATGCTATTGAAGAAGCCTTG 1500 
LeuValGlyLeuAlaTyrAspLeuLysLysPheSerGlriAsnAlalleGluGluAlaLeu 

1501 ATTCAACAAGAACAAAAGAAGATCAATAAAAAGAAGGCTAACATTAACTGGGGTCCAGTT 1S60 
I leGlnGlnGluGlnLy sLy s IleAsnLysLy sLysAla Ly si leAsnTrpGlyPr oVa 1 

1561 TTGACTGATTTGCCAATGTGGGACAAACAAACCTTCTTGGCTAAGTCTAAGGAAAACAAG 1620 
LeuThrAspLeuProHa tTrpAspLysGlnThrP heLeuAl a LysSer Ly sGluAsnLy s 

1621 GGTTTGGTTATCATTTCTGGTATTGTTCACGACGTATCTGGTTATATCTCTGAACATCCA 1680 
GlyLeuValllelleSerGlylleValHisAspValSerGlyTyrlleSerGIuHi sp ro 

1681 GGTGGTGAAACTTTAATTAAAACTGCATTAGGTAAGGACGCTACCAAGGCTTTCAGTGGT 1740 
GlyGlyGluThrLeuIleLysThrAlaLeuClyLysAspAlaThrLysAlaPheSerGly 

1741 GGTGTCTACCGTCACTCAAATGCCGCTCAAAATGTCTTGGCTGATATGAGAGTGGCTGTT 1800 
G ly Va ITy r ArgHi sSer A snAlaAlaG InAsnVal LeuAl aAspMe t At gVa i Al aVa 1 

1801 ATCAACGAAAGTAAGAACTCTGCTATTAGAATGGCTAGTAAGAGAGGTGAAATCTACGAA 1860 
IleLysGluSerLysAsnSerAlalleArgHetAlaSerLysArgGlyGluIleTyrGlu 

1861 ACTGGTAAGTTCTTTTAAGCATCACATTACAATAACAAAACTGCAACTACCATTAAAAAA 1920 
Thr G ly Ly sP hePheEnd 

1921 AAATTGAAAAATCATAAATTAAAAAAAAAAAAATCAATTGAATTTTTTTTTTTCATGATT 1980 

2041 TTTCATTTTAGTATTTTATTCTTCGTTATTTATGTATAGAAATTTTCATTTTCATTTAGA 2100 
210r TTCAGATTTGGTTATCTTTTTTCATTATATATCTTTTGCACTAAGTTTCAGCTTAAGTTC 2160 
2161 TATTTTTTATTTTTTTTTTCTGGGCCCTGGAGCAATAGATATGGGATGGCTTACTGCATC 2220 
2221 TCTTCAAAATTTCACAGTCATGCTCACCCTTAAGTTCTCAACCTTTT 2267 



1081 GATTGGACCATTAGftTTC(»ACA£»GACACTACATCTTGTTGAT6TTATTAACCGCTTTC 1140 : 
AspTrpThrl leArgPheGlnflisArgHlsTyrl leLeuLeuMet LeuLeuThr AlaPhe 

Fig. 2. [...] encoded amino acid sequence of the Δ9 fatty acid desaturase structural gene,
OLE1. Two consensus yeast TATA elements preceding the 1530-base-long ORF and the first four in-frame
methionine-specific codons are underlined. An OLE1 internal region of 268 amino acids displaying significant
identity to the rat liver Δ9 enzyme is delimited by asterisks. Potential membrane-spanning regions are highlighted
with lines above nucleotide and amino acid sequences.

base entry, the rat liver Δ9 desaturase, with significant
similarity to the OLE1 gene product (Fig. 3). The aligned
sequences show 36% identity and are greater than 60% similar
over the region encompassing the C-terminal 260 amino acids
of the shorter rat protein. No significant similarities exist
over the N-terminal 141 amino acids of the yeast and 99
amino acids of the rat sequences. The yeast open reading
frame extends 113 amino acids beyond the C-terminal end of
the rat sequence. Within the region of high amino acid similarity
there are three segments, having a minimum length of
10 residues, of very high identity (>70%) beginning at OLE1
amino acid positions 156, 331, and 368. The most highly
conserved of these is the first region where 17 of 23 identities
are observed including one stretch containing 11 of 12 perfect
matches.
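
The identity figures quoted here are simple ratios of matching positions to compared positions. The Python sketch below shows one way to compute such a percentage from two aligned strings. It is an assumption-laden illustration: the function names and the toy alignment are hypothetical, gapped columns are simply skipped, and no attempt is made to reproduce how BestFit itself scores similarity.

```python
# Illustrative percent-identity calculation for two aligned sequences.
# Function names and the toy alignment are hypothetical.

def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of ungapped alignment columns with identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
             if x != gap and y != gap]
    if not pairs:
        raise ValueError("no aligned residue pairs")
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def percent(matches, total):
    """Percentage rounded to one decimal place."""
    return round(100.0 * matches / total, 1)


if __name__ == "__main__":
    print(percent_identity("HRLWSH-R", "HRLWAHKR"))  # toy alignment
    print(percent(17, 23), percent(11, 12))          # counts from the text
```

Applied to the counts in the text, 17 of 23 and 11 of 12 both exceed the 70% threshold used to define the high-identity segments.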

The most conserved amino acid type within the compared
region of the yeast protein is histidine with 10 of 14 (71.4%)
residues in perfect alignment. Two other amino acid residues,
proline and arginine, also show greater than 50% total identity.
Arginine residues of the rat liver enzyme have been
previously identified as being involved in substrate binding



yeast 141 VFLFSVFYYAVGGVSITAOTHRLWSHBSYSAHWPUUJ'YAIFGCASVEGS 190 

rat 99 TLLWGIETYLlSAI£ITAGAHBU»SiWryKARI^LRIFLIIANTMAFQND 148 

yeast 191 AKWireRSHRXHHRyTimjy>PyDARRSIMySBMSHMLLI^ ! 238 

. II::.: I H . : . H I : : : 11 : 1 I : ! : : . : I I.:: 

rat 149 vyENARDHRAHHKFSBTHADPHNSRRGFFFSHVGWLLVRKHFAVKEKGGK 198 

yeast 239 .DITOKTWSMTlRFQHRHYIIJIMJ^TArVlPTLICGYFr^^ 286 

IJ.I:..:. :.|f:I.l iH .|::|||;. | :.: :: :|: . 
rat 199 U3USD1XAEKLVMF<S^SXYKPGLLIMCF1LPT^^ 248 

yeast 287 (3TRVFVI(X3ATFCZNSMAHriGTX2PFDDRRrPIU)NWITAIVTFGE6yH^ 336 

^ -1:1 -M: :1l M..I .(:|.. .|:| :..:.. .mm " 

rat 249 TTXRYTLV12JATWLWSAAiU.yGYIWYDIOT(»RENILVSLGSVGEGFHN 298 



yeast 337 FMHEFPTDVBNAIKWYQyDPTRVliyLTSLVGZAYDLKKPSfXlAIEEALI 386 

J 1 1- 1 1 ir... f.i. '. l..t | . II 
rat . 299 YHilAFPyDySASEyRmiINFTTFFIIX>lAAlJGIAyDRKKVSKAAV.IAia 347 

yeast 387 QQEQKKINKKK 397 

rat 348 KRTGDGSHKSS 358 

Fig. 3. Amino acid sequence comparison of the yeast and
rat liver Δ9 fatty acid desaturases. A 257-residue internal region
of the yeast Δ9 enzyme is aligned with the rat liver stearoyl-CoA
desaturase. Comparison was prepared by the GCG sequence analysis
program BestFit. Identical residue matches are indicated by connecting
solid lines. Two or one point between residues indicate decreasing
amino acid similarity. Percent similarity value is based on the number
of identical and two-point amino acid comparisons. Segments showing
high identity (>70%) are indicated with lines above those regions.
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0 100 

Fig. 4. Hydropathy analysis of yeast and rat liver Δ9 enzymes. Aligned Kyte-Doolittle (17) hydropathy
profiles of the yeast and rat liver Δ9 desaturase proteins. The presumptive double membrane-spanning sequences
are indicated by bold lines above those regions.



(9). Its significance as a highly conserved amino acid supports
this finding. Although the role of histidyl residues in the fatty
acid desaturases has not been examined, their highly conserved
appearance also suggests an important contribution to
enzyme function.

Structural Analysis and Proposed Topology of the Yeast Δ9
Enzyme—Striking similarities were also observed in the hydropathic
characteristics of the two enzymes (Fig. 4). Both
proteins contain two long hydrophobic regions (~50 residues)
that could potentially form two membrane-traversing loops,
each consisting of two transmembrane segments. Chou-Fasman
algorithms predict β-turn forming potential in the central
portions of each loop in both the yeast and rat liver proteins.
Inspection of the primary sequences at those sites reveals the
presence of multiple helix-breaking amino acids that could
serve to disrupt α-helical structure in order to form the looped
structures. These hydrophobic regions are at identical positions
in the aligned yeast and rat sequences. At least three
smaller hydrophobic regions (each <7 amino acids) are also
found at identical positions in the two proteins. The regions
of high consecutive amino acid identity, however, are not
within the hydrophobic sequences. The first region is located
between the two "transmembrane loop regions"; the second
and third identity regions are located at the C-terminal part
of the protein past the second "transmembrane loop." Neither
extension of the N- and C-terminal domains of the yeast
protein appears significantly hydrophobic and an examination of the
amino acid distribution in those regions further suggests that
they do not contribute to the integral membrane domains of
the protein. A proposed model of the topology of the yeast
protein in the ER membrane is given in Fig. 5. Assuming that
the membrane-spanning regions are confined to the predicted
hydrophobic sequences that are greater than 50 amino acids
long, the arrangement places most of the protein on the
cytosolic side of the ER membrane. Furthermore, all three
regions of high consecutive identity would be located on that
side of the membrane, which is consistent with its proposed
site of action (22).
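
The hydropathy profiles discussed above are sliding-window averages of per-residue hydropathy values. The Python sketch below shows such a calculation using the published Kyte-Doolittle scale. It is a minimal illustration under stated assumptions: the 19-residue window and the 1.6 threshold are the values commonly quoted from Kyte and Doolittle for membrane-spanning segments, and the text does not say which settings were used here; the function names are hypothetical.

```python
# Illustrative Kyte-Doolittle hydropathy profile (sliding-window mean).
# Window size, threshold, and function names are assumptions.

KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Mean hydropathy of each window of `window` consecutive residues."""
    seq = sequence.upper()
    if window not in range(1, len(seq) + 1):
        raise ValueError("window must be between 1 and the sequence length")
    values = [KYTE_DOOLITTLE[res] for res in seq]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def hydrophobic_windows(profile, threshold=1.6):
    """0-based start indices of windows whose mean exceeds the threshold."""
    return [i for i, v in enumerate(profile) if v > threshold]


if __name__ == "__main__":
    toy = "MKDE" + "LIVFALIVFALIVFALIVF" + "KRDE"  # hypothetical sequence
    profile = hydropathy_profile(toy)
    print(len(profile), hydrophobic_windows(profile))
```

Runs of consecutive above-threshold windows mark candidate membrane-spanning stretches; a region of roughly 50 hydrophobic residues, as described in the text, is long enough to accommodate two such segments joined by a turn.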

Growth and Fatty Acid Content of Gene-disrupted Yeast
Transformed with the Rat Liver Δ9 Desaturase—The significant
sequence and predicted structural similarities observed
between the yeast and rat Δ9 proteins prompted us to test
whether the rat enzyme could functionally replace the yeast
enzyme in S. cerevisiae, although there are additional residues
at the N- and C-terminal ends of the yeast peptide sequence
that are not found on the rat protein. A yeast-rat fusion gene
was constructed (see "Materials and Methods") placing codons
3-358 of the rat gene in-frame with the initial 27 codons




[Figure 5 key: ER lumen; hydrophobic sequence; high yeast/rat similarity; consecutive identity region; unique yeast sequence]

Fig. 5. Model for the orientation of the yeast desaturase in
the ER membrane. The numbers identify amino acid positions in
the yeast open reading frame.

of the yeast gene and promoter sequences separated by an 8-codon
linker region. This fusion gene was placed on multicopy
episomal and single-copy centromere-based (CEN) vectors
and introduced into the ole1 gene-disrupted yeast strain,
L8-14C. Fusion gene transformants were analyzed for growth
and lipid composition relative to the same gene-disrupted
strain transformed with the plasmid bearing the native OLE1
gene.

Yeast transformants bearing either the native OLE1 or the
yeast-rat fusion desaturase gene (two isolates) on an episomal
plasmid were found cured of the UFA requirement and, surprisingly,
showed identical growth rates (Fig. 6A), indicating
significant conservation of Δ9 desaturase tertiary structure
and an ability of the rat enzyme to interact with the yeast
cytochrome b5. In addition, because the rat protein is 113
residues shorter than the yeast desaturase at the C-terminal
end and yet can functionally substitute for the yeast enzyme
in S. cerevisiae, it appears that this extension of the yeast
protein may be nonessential for catalytic functions. We cannot
exclude the possibility, however, that the additional residues
may be involved in other functions that influence its
catalytic efficiency or optimize interactions with other components
of the desaturase system.

An analysis of stationary phase cellular lipid compositions
revealed, however, significant differences in the percentage of
16-carbon fatty acid species in the yeast-rat fusion gene
transformants relative to the wild type control and, as a result,
a modest decrease in the percent total UFA (Table I). The
lower percentage of 16:1 and increased 16:0 species found in



Table I

Fatty acid composition of transformed S. cerevisiae
Stationary phase L8-14C cells transformed with OLE1 or the yeast-rat
Δ9 chimeric gene on multiple (episomal) or single (CEN) copy
number plasmids were harvested and cellular lipids analyzed as
described under "Materials and Methods."

Plasmid type and                    Fatty acids (%)
transformant         14:0    16:0    16:1    18:0    18:1     UFA

Episomal
  OLE1               1.95   21.92   41.70    4.66   29.77   71.47
  RAT1E              1.40   33.00   28.77    4.61   32.21   60.98
  RAT2E              1.41   30.94   27.42    4.73   35.51   62.93
CEN
  OLE1               1.16   18.38   34.05    9.10   37.32   71.36
  RAT1C              5.58   40.67   21.22    8.70   23.82   45.04
  RAT2C              7.23   39.71   15.83   10.89   26.35   42.18



those strains may reflect a preference of the rat Δ9 enzyme
for the 18:0-CoA substrate over 16:0-CoA.

Although a yeast-rat Δ9 desaturase fusion gene is capable
of functionally replacing the native OLE1 of S. cerevisiae
when present on a high copy number plasmid, a more stringent
test of the efficiency of the rat protein in yeast would be
to examine cells transformed with a single copy of the fusion
gene. Cells containing the chimeric gene on CEN plasmid
YCp50 showed growth rates that are reduced approximately
65% relative to wild type (Fig. 6B).

Similarly, the lipid composition of CEN plasmid-bearing 
yeast transformants differed markedly between those contain- 
ing the chimeric gene and those containing the cloned yeast 
gene (Table I). The relative UFA levels were reduced approx- 
imately 38% in cells containing the rat gene coding sequence 
and the compensatory relative increase in saturated fatty 
acids resulted in a doubling of the 16:0 content and increased 
14:0 levels, but no significant change in the level of 18:0. 
Thus, the yeast-rat Δ9 desaturase fusion gene can functionally 
replace the native OLE1 of S. cerevisiae, although its action 
results in striking differences in cellular fatty acid composi- 
tions. 
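
The relative changes quoted above can be recomputed from the CEN rows of Table I. The following is a minimal sketch in Python; averaging the two independent fusion-gene transformants (RAT1C and RAT2C) is an assumption made here for illustration, since the text does not state how the approximate figures were derived.

```python
# Values (% of total fatty acids) copied from the CEN rows of Table I.
CEN = {
    "OLE1":  {"14:0": 1.16, "16:0": 18.38, "18:0": 9.10,  "UFA": 71.36},
    "RAT1C": {"14:0": 5.58, "16:0": 40.67, "18:0": 8.70,  "UFA": 45.04},
    "RAT2C": {"14:0": 7.23, "16:0": 39.71, "18:0": 10.89, "UFA": 42.18},
}


def mean_of_fusions(species):
    """Average of the two yeast-rat fusion transformants (an assumption)."""
    return (CEN["RAT1C"][species] + CEN["RAT2C"][species]) / 2


def relative_change(species):
    """Fractional change of the fusion mean relative to the OLE1 control."""
    reference = CEN["OLE1"][species]
    return (mean_of_fusions(species) - reference) / reference


ufa_change = relative_change("UFA")
fold_16_0 = mean_of_fusions("16:0") / CEN["OLE1"]["16:0"]

print(f"UFA change relative to OLE1: {ufa_change:.1%}")
print(f"16:0 fold increase:          {fold_16_0:.2f}")
```

Under this averaging assumption the UFA level falls by a little under 40% and the 16:0 content roughly doubles, in line with the approximate figures given in the text.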



In previous studies using gene disruption and lipid analytical 
methods (12) we provided evidence suggesting that the 
OLE1 gene encoded the yeast Δ9 fatty acid desaturase. The 
deduced OLE1 amino acid sequence and physical comparisons 
of the yeast and rat liver proteins given here provide further 
proof that the OLE1 locus contains the authentic structural 
gene for the desaturase. The aligned regions of consecutive 
identity between these two proteins from widely divergent 
sources suggest that they may represent conserved regions 
with similar function. The finding that the rat Δ9 fatty acid 
desaturase gene can complement OLE1 in S. cerevisiae although 
the two proteins have only 36% identity suggests that 
there is conserved functional interaction among cytochrome 
b5-mediated desaturase systems. Thus, ER-bound Δ9 enzymes 
from other organisms and possibly other cytochrome b5-mediated 
desaturases, such as the Δ12 and Δ15, may also function 
in yeast. 

Acknowledgment: We wish to thank Philipp Strittmatter for plasmids 
containing the rat stearoyl-CoA desaturase gene. 
Abstract 

The enzyme Δ6-desaturase is responsible for the conversion of linoleic acid (18:2) to γ-linolenic acid 
(18:3γ). A cyanobacterial gene encoding Δ6-desaturase was cloned by expression of a Synechocystis 
genomic cosmid library in Anabaena, a cyanobacterium lacking Δ6-desaturase. Expression of the 
Synechocystis Δ6-desaturase gene in Anabaena resulted in the accumulation of γ-linolenic acid (GLA) and 
octadecatetraenoic acid (18:4). The predicted 359 amino acid sequence of the Synechocystis Δ6-desaturase 
shares limited, but significant, sequence similarity with two other reported desaturases. Analysis of three 
overlapping cosmids revealed a Δ12-desaturase gene linked to the Δ6-desaturase gene. Expression of 
Synechocystis Δ6- and Δ12-desaturases in Synechococcus, a cyanobacterium deficient in both desaturases, 
resulted in the production of linoleic acid and γ-linolenic acid. 



Introduction 

Appropriate control of lipid metabolism is critical 
to normal cellular and organismal function. In 
many instances, the number, position, and stereochemical 
orientation of carbon-carbon double 
bonds are critical to the biological activity of certain 
fatty acids. For example, there is considerable 
interest in the polyunsaturated C18 fatty 
acids α-linolenic acid (18:3Δ9,12,15) and γ-linolenic 
acid (GLA; 18:3Δ6,9,12). GLA is the result 
of desaturation of linoleic acid (18:2Δ9,12) catalyzed 
by the enzyme Δ6-desaturase. Consumption 
of vegetable oils rich in GLA may alleviate 
hypercholesterolemia and other clinical disorders 
which correlate with susceptibility to coronary 
heart disease [3]. The therapeutic benefits of dietary 
GLA may result from its being a precursor 
to arachidonic acid (20:4) and thus subsequently 
contributing to prostaglandin synthesis [26]. 
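
The fatty acid shorthand used here (carbons:double bonds, followed by Δ and the double-bond positions) is easy to handle programmatically. The sketch below is illustrative only; the function names `parse_fatty_acid` and `desaturate` are hypothetical helpers introduced for this example and are not part of the original work.

```python
import re

# carbons:double_bonds, optionally followed by Δ and comma-separated positions
_SHORTHAND = re.compile(r"(\d+):(\d+)(?:Δ(\d+(?:,\d+)*))?")


def parse_fatty_acid(shorthand):
    """Split e.g. '18:3Δ6,9,12' into (carbons, double_bonds, positions)."""
    match = _SHORTHAND.fullmatch(shorthand)
    if match is None:
        raise ValueError(f"unrecognised shorthand: {shorthand!r}")
    carbons = int(match.group(1))
    bonds = int(match.group(2))
    positions = ()
    if match.group(3):
        positions = tuple(int(p) for p in match.group(3).split(","))
        if len(positions) != bonds:
            raise ValueError("position count must equal double-bond count")
    return carbons, bonds, positions


def desaturate(shorthand, position):
    """Return the shorthand after introducing one double bond at `position`."""
    carbons, bonds, positions = parse_fatty_acid(shorthand)
    if bonds and not positions:
        raise ValueError("double-bond positions are required")
    if position in positions:
        raise ValueError(f"a double bond already exists at Δ{position}")
    new_positions = sorted(positions + (position,))
    joined = ",".join(str(p) for p in new_positions)
    return f"{carbons}:{bonds + 1}Δ{joined}"


# A Δ6-desaturase acting on linoleic acid gives γ-linolenic acid (GLA).
gla = desaturate("18:2Δ9,12", 6)
print(gla)
```

The same helper expresses the other steps named in the text, for example `desaturate("18:1Δ9", 12)` for the Δ12-desaturase reaction that produces linoleic acid.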



The nucleotide sequence data reported will appear in the EMBL, GenBank and DDBJ Nucleotide Sequence Databases under the 
accession number L11421. 
Most plant seed oils are deficient in GLA; 
therefore, we investigated the feasibility of obtaining 
a Δ6-desaturase gene from a heterologous 
source which could, in turn, be used to transform 
plants to obtain seed oils containing GLA. 
The unicellular cyanobacterium Synechocystis 
PCC 6803 was chosen as a source for the Δ6-desaturase 
because Synechocystis accumulates 
GLA to a level greater than 20% of the total fatty 
acid mass [16] and because the lipid composition 
of cyanobacteria is similar to that of higher plant 
chloroplasts [17]. Furthermore, unlike other 
prokaryotes, cyanobacteria have aerobic desaturases, 
which make them good models for understanding 
lipid metabolism in higher plants [25]. 

With the exception of plant Δ9-stearoyl acyl 
carrier protein desaturases, cyanobacterial, fungal, 
plant and animal desaturases are integral 
membrane proteins, a property that makes them 
difficult to purify and subsequently clone and 
characterize [1, 18, 21, 22, 24]. Therefore, 
we developed a molecular genetic strategy to isolate 
a Δ6-desaturase gene from Synechocystis 
PCC 6803. A Synechocystis cosmid library was 
constructed and conjugated into wild-type Anabaena 
PCC 7120, a cyanobacterium deficient in 
Δ6-desaturase, to identify gain-of-function Anabaena 
transconjugants that produce GLA and 
therefore contain a functional Synechocystis Δ6-desaturase 
gene. With this approach, we cloned 
a Δ6-desaturase gene from Synechocystis and verified 
its expression in another cyanobacterium, 
Synechococcus PCC 7942. 

Materials and methods 

Strains and culture conditions 

Synechocystis PCC 6803 was obtained from the 
American Type Culture Collection. Anabaena 
PCC 7120 and Synechococcus PCC 7942 were 
kindly provided by Dr James Golden and Dr 
Susan Golden, respectively (Department of Biology, 
Texas A&M University). These strains were 
grown photoautotrophically at 30 °C in BG-11 
medium [19] under illumination of incandescent 
lamps (60 μE m⁻² s⁻¹). Cosmids and plasmids 
were selected and propagated in Escherichia coli 
strain DH5α on LB medium supplemented with 
antibiotics at standard concentrations [15]. 

Construction of Synechocystis cosmid genomic library 

Total genomic DNA from Synechocystis 
PCC 6803 was partially digested with Sau3A I and 
fractionated on a sucrose gradient [2]. Fractions 
containing 30 to 40 kb DNA fragments were selected 
and ligated into the dephosphorylated 
Bam HI site of the cosmid vector, pDUCA7 
[4]. The ligated DNA was packaged in vitro 
[2], and packaged phage were propagated in 
E. coli DH5α MCR containing the helper plasmid, 
pRL528, encoding Ava I and Eco47 II methylases 
[10]. A total of 1152 colonies were randomly 
isolated and individually maintained in 
twelve 96-well microtiter plates. 

Conjugation of Synechocystis cosmid library into 
Anabaena 

Anabaena cells were grown to mid-log phase in 
BG-11 liquid medium, washed and resuspended 
in the same medium to a final concentration of ca. 
2x 10^ cells per ml. A mid-log phase culture of 
E. coli containing the RP4 plasmid [5, 10] grown 
in LB containing 50 μg ampicillin per ml was 
washed and resuspended in fresh LB medium. 
Anabaena cells were then mixed with E. coli containing 
RP4 and spread evenly on BG-11 plates 
containing 5% LB. The cosmid genomic library 
was replica plated onto LB plates containing 
50 μg kanamycin and 17.5 μg chloramphenicol 
per ml and was subsequently patched onto BG-11 
plates containing Anabaena and E. coli carrying 
the RP4 plasmid. After 24 h of incubation at 
30 °C, neomycin was underlaid to a final concentration 
of 30 μg/ml and incubation at 30 °C 
was continued until transconjugants appeared 
[10]. 
Fatty acid analysis 

Wild-type and transgenic cyanobacterial cultures 
were grown as described [19], harvested by centrifugation, 
and washed twice with distilled water. 
Fatty acid methyl esters were prepared from these 
cultures [8] and were analyzed by gas-liquid 
chromatography (GLC) using a Tracor-560 
equipped with a hydrogen flame ionization detector 
and capillary column (30 m x 0.25 mm 
bonded FSOT Superox II; Alltech Associates, 
IL). Retention times and co-chromatography of 
standards (obtained from Sigma Chemical Co.) 
were used for identification of fatty acids. The 
average fatty acid composition was determined as 
the ratio of peak area of each C18 fatty acid normalized 
to C17:0 internal standard. 
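
The normalization described above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the peak areas are hypothetical numbers, and the final conversion of the normalized ratios to percentages of the C18 total is an assumption made here because Table 1 reports percentages.

```python
def normalize_to_standard(peak_areas, standard="17:0"):
    """Divide each fatty acid peak area by the internal-standard peak area."""
    standard_area = peak_areas[standard]
    if standard_area <= 0:
        raise ValueError("internal standard peak area must be positive")
    return {
        fatty_acid: area / standard_area
        for fatty_acid, area in peak_areas.items()
        if fatty_acid != standard
    }


def percent_composition(ratios):
    """Express normalized ratios as percentages of their sum (assumed step)."""
    total = sum(ratios.values())
    return {fatty_acid: 100 * r / total for fatty_acid, r in ratios.items()}


# Hypothetical GLC peak areas (arbitrary units) for one sample.
areas = {"17:0": 500.0, "18:0": 50.0, "18:1": 250.0, "18:2": 200.0}

ratios = normalize_to_standard(areas)
percent = percent_composition(ratios)
for fatty_acid, value in percent.items():
    print(f"{fatty_acid}: {value:.1f}%")
```

Dividing by the internal standard first makes runs with different injection volumes comparable before the values are averaged.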



DNA sequence analysis 

Standard molecular biology techniques were per- 
formed as described [2, 15]. Dideoxy sequencing 
[20] of pBS1.8 was performed with Sequenase 
(United States Biochemical) on both strands 
using specific oligonucleotide primers synthesized 
by the Advanced DNA Technologies Laboratory 
(Biology Department, Texas A&M University). 
DNA sequence analysis was done with the GCG 
(Madison, WI) software [9]. 



Results 

Gain-of-function expression of GLA in Anabaena 

Anabaena PCC 7120, a filamentous cyanobacterium, 
is deficient in GLA but contains significant 
amounts of linoleic acid, the precursor for GLA 
(Fig. 2; Table 1). A Synechocystis cosmid library 
was conjugated into Anabaena PCC 7120 to identify 
transconjugants that produce GLA. Individual 
transconjugants were isolated after conjugation 
and grown in 2 ml BG11N+ liquid medium 
with 15 μg neomycin per ml. Fatty acid methyl 
esters were prepared from cultures containing 
pools of ten transconjugants and analyzed by 
GLC; representative GLC profiles are shown in 
Fig. 2. Two pools (of 25 pools representing 
250 transconjugants) were identified that produced 
GLA. Individual transconjugants of each 
GLA positive pool were analyzed for GLA production; 
two independent transconjugants, AS13 
and AS75, one from each pool, were identified 
that expressed significant levels of GLA and 
which contained cosmids cSy13 and cSy75, respectively 
(Fig. 1). These cosmids overlap in a 
region approximately 7.5 kb in length. A 3.5 kb 
Nhe I fragment of cSy75 was recloned in the vector 
pDUCA7 to create pSy75-3.5 and transferred 
to Anabaena, resulting in gain-of-function expression 
of GLA (Table 1). 
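
The saving from screening pools rather than individual transconjugants can be made explicit with a short calculation. This is an illustrative sketch; it assumes that every member of each positive pool is re-analyzed individually, which is how the text above reads, and the function name is hypothetical.

```python
def pooled_screen_runs(n_clones, pool_size, positive_pools):
    """GLC runs needed: one per pool, plus one per member of each positive pool."""
    n_pools = -(-n_clones // pool_size)  # ceiling division
    return n_pools + positive_pools * pool_size


# Numbers from the text: 250 transconjugants, pools of ten, two positive pools.
runs = pooled_screen_runs(n_clones=250, pool_size=10, positive_pools=2)
print(f"pooled screen: {runs} runs, individual screen: 250 runs")
```

With the numbers reported here, the pooled design needs 25 first-round analyses plus 20 follow-up analyses instead of 250.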

Two Nhe I/Hind III subfragments (1.8 and 
1.7 kb) of the 3.5 kb Nhe I fragment of pSy75-3.5 
were subcloned into pBluescript (Fig. 1) for sequencing. 
Subsequently, both subfragments were 
transferred into a conjugal expression vector, 
pAM542 (T.S. Ramasubramanian and J. Golden, 
personal communication), in both forward and 
reverse orientations with respect to a cyanobacterial 
rbcLS promoter and were introduced into 
Anabaena by conjugation. Transconjugants containing 
the 1.8 kb fragment in the forward orientation 
(pAM542-1.8F) produced significant quantities 
of GLA and octadecatetraenoic acid (Fig. 2; 



Fig. 1. Maps of cosmids cSy75, cSy13 and cSy7 with overlapping 
regions and subclones. The origins of the subclones of cSy75, 
pSy75-3.5 and cSy7 are indicated by the dashed diagonal 
lines. Restriction sites that have been inactivated are in 
parentheses. 
Table 1. Composition of C18 fatty acids in wild-type and transgenic cyanobacteria. 

                                             Fatty acid (%)
Strain                          18:0   18:1   18:2   18:3(α)   18:3(γ)   18:4

Wild type
  Synechocystis sp. PCC 6803    13.6    4.5   54.5    0         27.3
  Anabaena sp. PCC 7120          2.9   24.8   37.1   35.2        0
  Synechococcus sp. PCC 7942    20.6   79.4    0      0          0

Anabaena transconjugants
  cSy75                          3.8   24.4   22.3    9.1       27.9     12.5
  pSy75-3.5                      4.3   27.6   18.1    3.2       40.4      6.4
  pAM542-1.8F                    4.2   13.9   12.1   19.1       25.4     25.4
  pAM542-1.8R                    7.7   23.1   38.4   30.8        0        0
  pAM542-1.7F                    2.8   27.8   36.1   33.3        0        0
  pAM542-1.7R                    2.8   25.4   42.3   29.6        0        0

Synechococcus transformants
  pAM854                        27.8   72.2    0                 0        0
  pAM854-Δ12                     4.0   43.2   46.0               0        0
  pAM854-Δ6                     18.2   81.8    0                 0        0
  pAM854-Δ6 & Δ12               42.7   25.3   19.5              16.5      0

18:0, stearic acid; 18:1, oleic acid; 18:2, linoleic acid; 18:3(α), α-linolenic acid; 18:3(γ), γ-linolenic acid; 18:4, octadecatetraenoic acid. 



Table 1). Transconjugants containing other constructs, 
either reverse oriented 1.8 kb fragment or 
forward and reverse oriented 1.7 kb fragment, did 
not produce detectable levels of GLA (Table 1). 

Figure 2 compares the C18 fatty acid profile of 
an extract from wild-type Anabaena (Fig. 2A) with 
that of transgenic Anabaena containing the 1.8 kb 
fragment of pSy75-3.5 in the forward orientation 
(Fig. 2B). GLC analysis of fatty acid methyl esters 
from pAM542-1.8F revealed a peak with a 
retention time identical to that of an authentic 
GLA standard. Analysis of this peak by gas 
chromatography-mass spectrometry (GC-MS) 
confirmed that it had the same mass fragmentation 
pattern as a GLA reference sample (data not 
shown). 



Two genes involved in C18 fatty acid biosynthesis are 
linked 

We isolated a third cosmid, cSy7, containing a 
Δ12-desaturase gene by screening the Synechocystis 
genomic library with an oligonucleotide synthesized 
from the published Synechocystis Δ12-desaturase 
gene sequence [25]. We identified a 
1.7 kb Ava I fragment from this cosmid containing 
the Δ12-desaturase gene and subcloned it into 
pBluescript to create pBSΔ12 (Fig. 1). We then 
used this probe to demonstrate that cSy13 not 
only contains a Δ6-desaturase gene but also a 
Δ12-desaturase gene (Fig. 1). Genomic filter hybridizations 
further showed that both the Δ6- and 
Δ12-desaturase genes are unique in the Synechocystis 
genome, indicating that two functional 
genes involved in C18 fatty acid desaturation are 
linked in the Synechocystis genome. 

Sequence analysis and comparison with other desaturases 

The nucleotide sequence of the 1.8 kb fragment of 
pSy75-3.5 including the functional Δ6-desaturase 
gene was determined. An open reading frame encoding 
a polypeptide of 359 amino acids was 




AAGCTTCACrrcCXnTTTATATTGTGACCATOCriTCCX^GGCATCr^^ - 24 1 

TTrrarOCTOCCTTTAGAGAGTATTTTCTCCAAGTCa -181 

AAATC ATATACACy^CTATCCCAATATTGCCACAOCTTTGATGACrCACTCrrAG^ - 1 2 1 

ACTAAAATTCTAGCAATCGACTCCCAGTTGGAATAAATTTTTAGTCTCCrC^^ -61 

A G ' rm - lM - A Xn-AGTTAATCaXrGCTATAATGTGAAA G 'r r ri-ITATCTATTTAAA -1 



Fig. 2. GLC analysis of fatty acids of wild-type and transgenic 
Anabaena. C18 fatty acid methyl esters are shown; the horizontal 
axis is retention time. 
A. Anabaena wild type (arrow indicates migration time of 
GLA). B. Transconjugant of Anabaena with pAM542-1.8F. 
GLA, γ-linolenic acid; 18:4, octadecatetraenoic acid. Peaks 
were identified by comparing the elution times with known 
standards of fatty acid methyl esters and were confirmed by 
GC-MS. 



identified (Fig. 3). It shares limited, but significant, 
amino acid sequence similarity with Δ15-desaturase 
from Brassica napus [1] and Δ12-desaturase 
[25]. A Kyte-Doolittle hydropathy 
analysis [14] identified two regions of hydrophobic 
amino acids that could represent transmembrane 
domains (Fig. 4A); furthermore, the hydropathic 
profile of the Δ6-desaturase is similar to 
that of the Synechocystis Δ12-desaturase gene 
(Fig. 4B; [25]), Δ9-desaturase (not shown [23]) 
and Δ15-desaturase (not shown [1]). 
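
A Kyte-Doolittle profile of the kind referred to above is a sliding-window average of per-residue hydropathy values. The sketch below is a plain Python illustration and not the GCG implementation used in the study; the 19-residue window and the 1.6 cutoff are commonly used choices assumed here (the text does not state the settings), and the test sequence is a made-up peptide, not the desaturase.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Mean hydropathy of every window of `window` consecutive residues."""
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    running = sum(scores[:window])
    profile = [running / window]
    for i in range(window, len(scores)):
        running += scores[i] - scores[i - window]
        profile.append(running / window)
    return profile


def candidate_segments(profile, cutoff=1.6):
    """Start indices (0-based) of windows whose mean exceeds the cutoff."""
    return [i for i, value in enumerate(profile) if value > cutoff]


# Hypothetical peptide: a 19-residue hydrophobic stretch with charged flanks.
toy = "KDEKDE" + "LLIVVALLIAVLLIVVALL" + "KDEKDE"
profile = hydropathy_profile(toy)
hits = candidate_segments(profile)
print(f"{len(profile)} windows, {len(hits)} above cutoff")
```

Windows above the cutoff correspond to the kind of regions marked as putative membrane-spanning segments in a hydropathy plot.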
CATTGCCTTGGGATTGAAGCAAAATGGCAAAATCCCTCGTAAATCTATGATCGAAGCCT 113 9 

TTCTGTTGCCCOCCGACCAAATCCCCGATGCTGACCAAAGCTTGATGTTOGCATTGCTC 124 8 

CAAACCCACTTTGAOGGGCTTCATTOGCCGCAGTTTCAAGCTGACCTAGGAGCCAAAGA 1307 

TTGOGTGATTrrCCTC AAATCCGCTGGGAT ATTGAAAGOCITCACC ACCTTTGGTTTCT 1366 

ACCCItXrrCAATGGGAAOGACAAACCGTCAGAATTGTTTATTCTGGTGACACCATCACC 1425 

CCTACCGATTTTTG*OC*TTTTTGCCAAGGAATTCrATOCOCACrATCTCCATC0CACT 1602 
CCCCCGCCTCTACAAAATTTTATCCATCAGCTAGC 1637 

Fig. 3. Nucleotide and predicted amino acid sequences of the 
Synechocystis Δ6-desaturase. Amino acid residues are numbered 
on the left; nucleotide positions are numbered on the 
right. 



Transformation of Synechococcus with Δ6- and 
Δ12-desaturase genes 

The unicellular cyanobacterium Synechococcus 
PCC 7942 is deficient in both linoleic acid and 
GLA [16]. We cloned Δ12- and Δ6-desaturase 
genes individually and together into pAM854 [6], 
a shuttle vector that contains sequences necessary 
for the integration of foreign DNA into the 
genome of Synechococcus [11]. Synechococcus 
was transformed with these gene constructs and 
colonies were selected [6]. Fatty acid methyl esters 
were prepared from transgenic Synechococcus 
and analyzed by GLC. 

Fig. 4. Hydropathy profiles of (A) Δ6- and (B) Δ12-desaturase 
from Synechocystis. The Kyte and Doolittle algorithm in the 
GCG sequence analysis software was used to predict relative 
hydrophobicity in the predicted polypeptides of Δ6-desaturase 
and Δ12-desaturase [9]. Putative membrane-spanning regions 
are indicated by solid bars. 

Table 1 shows that the principal C18 fatty acids 
of wild-type Synechococcus are stearic acid (18:0) 
and oleic acid (18:1). Synechococcus transformed 
with pAM854-Δ12 expressed linoleic acid (18:2) 
in addition to the principal fatty acids. Transformants 
with pAM854-Δ6&Δ12 produced both linoleate 
and GLA (Table 1). These results indicate 
that Synechococcus containing both Δ12- and 
Δ6-desaturase genes has gained the capability of 
introducing a second double bond at the Δ12 position 
and a third double bond at the Δ6 position 
of C18 fatty acids. However, no changes in fatty 
acid composition were observed in the transformant 
containing pAM854-Δ6, indicating that in 
the absence of substrate synthesized by the Δ12-desaturase, 
the Δ6-desaturase is inactive; whether 
Δ6-desaturase requires two double bonds or only 
a double bond at the Δ12 position is not clear. 



This experiment further confirms that the 1.8 kb 
Nhe I/Hind III fragment (Fig. 1) contains both 
coding and promoter regions of the Synechocystis 
Δ6-desaturase gene. 



Discussion 

We used a gain-of-function approach to identify 
a cyanobacterial gene encoding an enzyme involved 
in fatty acid metabolism. The enzyme 
Δ6-desaturase is required for the conversion of 
linoleic acid (18:2Δ9,12) to γ-linolenic acid 
(18:3Δ6,9,12), or GLA. Conjugation of a Synechocystis 
PCC 6803 cosmid library into the filamentous 
cyanobacterium Anabaena, which lacks 
GLA but does contain linoleic acid, the precursor 
to GLA, resulted in the gain-of-function expression 
of GLA and octadecatetraenoic acid. 
The ubiquitous presence of octadecatetraenoic 
acid (18:4Δ6,9,12,15) in GLA-producing transgenic 
Anabaena provides additional insight into 
the C18 desaturation pathway. This unusual fatty 
acid, which is present normally in fish oils and in 
some plant species of the Boraginaceae family 
[12, 13], must result from the further desaturation 
of α-linolenic acid by a Δ6-desaturase or desaturation 
of GLA by a Δ15-desaturase. We further 
demonstrated that a 1.8 kb region of the Synechocystis 
genome contains both coding and promoter 
regions of the Synechocystis Δ6-desaturase 
gene and is sufficient to produce GLA in Anabaena 
and Synechococcus, although in the latter 
case only when a second Synechocystis gene encoding 
Δ12-desaturase is also present to generate 
linoleic acid. 

The success of the gain-of-function approach 
described here, coupled with other molecular genetics 
tools now available in cyanobacteria, 
makes possible the identification of other cyanobacterial 
genes for which there is no selectable 
phenotype. Certainly, other genes involved in lipid 
metabolism are prime candidates; the triad of 
Synechocystis, Anabaena and Synechococcus provides 
an opportunity to isolate most genes involved 
in fatty acid metabolism in cyanobacteria. As a 
consequence, genes encoding desaturases for the 
entire C18 fatty acid desaturation pathway soon 
will be available for study. This will facilitate analysis 
of the factors regulating levels of fatty acid 
desaturation and the role of these desaturation 
levels in overall cellular physiology, including 
chilling tolerance [25]. It is noteworthy that 
transgenic Anabaena and Synechococcus with altered 
levels of polyunsaturated fatty acids were 
similar to wild type in growth rate and morphology 
(data not shown) when grown under standard 
conditions; effects of lower temperatures 
were not examined. The availability of these genes 
also will allow detailed structure/function analysis 
of this class of desaturases and will provide 
further insight into the evolutionary constraints 
on protein structure and function. 

Recently, transgenic tobacco plants were produced 
containing a chimeric cyanobacterial desaturase 
gene, comprised of the Synechocystis Δ6-desaturase 
gene fused to sequences encoding a 
carrot extensin signal peptide [7] and an endoplasmic 
reticulum retention sequence (KDEL); 
expression of this chimeric gene was driven by a 
CaMV 35S promoter. These transgenic plants 
accumulated small but significant amounts of 
GLA (A.S. Reddy and T.L. Thomas, unpublished 
results). These results suggest that cyanobacterial 
genes involved in fatty acid metabolism can be 
used to generate transgenic plants with altered 
fatty acid compositions. These modifications 
could lead to improved nutritional characteristics 
or increased industrial value of seed oils or improved 
growth potential of crop plants. In addition, 
analysis of desaturase expression in a higher 
plant context may provide insight into the relative 
role of desaturases in the chloroplast, presumably 
the more natural context of cyanobacterial 
desaturases, vis-à-vis the endoplasmic reticulum. 
Summary 

A cauliflower mosaic virus (CaMV) 35S promoter 
derivative, which is tightly repressed by the Tn10-encoded 
Tet repressor in a transient expression system 
as well as in transgenic plants, has been constructed. 
After treatment of transgenic plants with tetracycline 
(Tc) the activity of the reporter enzyme β-glucuronidase 
(GUS) increased up to 500-fold in tissue culture 
as well as under greenhouse conditions. Efficient 
de-repression was achieved by Tc uptake through the 
roots as well as by Tc treatment of leaves of intact 
plants. As Tc is not very stable in the plants, this 
system can also be used for a transient expression of a 
transgene. This system provides a unique tool for 
regenerating transgenic plants carrying a repressed 
transgene and for efficiently de-repressing its activity 
by a specific inducer at any time point of further 
development. 



Introduction 

The ability to introduce foreign genes into the plant 
genome has provided the methodology to analyse the 
molecular mechanisms leading to co-ordinated expression 
of genes in transgenic plants (Schell, 1987). It also 
serves to express alien gene products or to modulate the 
expression of endogenous proteins (Sonnewald et al., 
1991). The last option in particular opens new avenues for 
analysing and understanding the contribution of a defined 
gene to the organism's phenotype (Berg, 1991). When 
using this approach, a regulated promoter is often desirable 
in order to induce expression at defined time points 
during development, or only in certain parts of a transgenic 
plant. In addition, a tightly repressed promoter is 
absolutely required if the expression of a certain gene 
product of interest interferes with the regeneration 
process. 

A number of plant promoters regulated by light 
(Kuhlemeier et al., 1987), heat (Ainley and Key, 1990), stress 
(Freeling and Bennett, 1985), or wounding (Keil et al., 
1989) are available for the controlled expression of a 
transgene. However, they suffer from the disadvantage 
that the inducing conditions influence a variety of re- 
sponses in the plants. Therefore we have developed a 
tightly repressed, specifically de-repressible promoter by 
a suitable combination of bacterial control elements with a 
strong, normally constitutive plant promoter. 

We have reported previously that the Tn10-encoded Tet 
repressor can regulate the expression of a modified CaMV 
35S promoter in transgenic tobacco plants (Gatz et al., 
1991). In principle, we generated a transgenic plant which 
constitutively synthesizes the bacterial repressor protein 
(tetR+). Two binding sites for the Tet repressor, the 19 bp 
palindromic tet operators, were introduced downstream 
of the TATA-box of the normally constitutive CaMV 35S 
promoter. When stably integrated into the genome of the 
tetR+ plant, only low levels of activity from this modified 
promoter were detected. An 80-fold increase in RNA 
levels was achieved after 0.5 h upon vacuum infiltration of 
single leaves with a buffer containing the inducer tetracycline 
(Tc, 0.1 mg l⁻¹), which prevents the repressor from 
binding to its operator sequences. Since then we have 
significantly improved the system by a further reduction of 
the expression in the uninduced stage using a different 
arrangement of the tet operators within the promoter. 
Moreover, we describe the effect of a variety of Tc application 
procedures, as well as the kinetics of induction in 
whole plants and the time course of the decline of the 
amount of GUS mRNA after omission of the Tc treatment. 

Results and discussion 

Combination of three tet operators with the CaMV 35S 
promoter ('Triple-Op' promoter) 

It has been proposed by Lin and Riggs that repression 
efficiencies increase with the number of operators within a 
promoter, if each copy by itself contributes to repression 
(Lin and Riggs, 1975). In two previous studies we have 
investigated the influence of single Tet repressor-operator 
Figure 1. Schematic overview of gel shift analysis, transient expression analysis and Northern blot analysis of the Triple-Op promoter. 

(a) Region -90 to +17 of the Triple-Op promoter. Each square represents 1 bp. The sequence of activati... 1989) is boxed, as 
are the tet operators (ACTCTATCACTGATAGAGT) and the TATA-box (TATATAA). The arrow marks the start site of transcription (Odell et al., 1985). 

(b) Mobility shift experiment demonstrating simultaneous occupation of the three operator sites within the Triple-Op promoter. One femtomole of a 350 bp 
fragment containing the complete Triple-Op promoter was incubated with increasing amounts of Tet repressor purified to homogeneity from E. coli (gift of 
Drs Altschmied and Hillen). Numbers below the lanes indicate the amount (fmol) of Tet repressor added. 

(c) Cat activity in transiently transformed tobacco protoplasts which constitutively express the repressor gene (Gatz et al., 1991). Ten micrograms of pTriple-Op-Cat 
(lanes 1-6) or pTET7 (lanes 7 and 9; Gatz and Quail, 1988) were introduced into tobacco protoplasts using polyethylene glycol mediated gene transfer 
and incubated with (+) or without (-) Tc. Cat activity shown in lane 9 represents background levels when protoplasts were transformed with herring sperm 
DNA. 

(d) Northern blot analysis of Tc-treated transgenic plants containing the chimeric Triple-Op promoter-GUS gene (lane 1) or the wild-type CaMV 35S 
promoter-GUS gene (lane 2). Rehybridization of the blot with the probe for the ribosomal gene S4 was done to show that equal amounts of RNA were loaded. 
RNA from the three highest expressing plants of each transformation was combined for this analysis. 



complexes in different positions on the expression of the 
CaMV 35S promoter (Frohberg et al., 1991; Heins et al., 
1992). If located upstream of the TATA-box, efficient repression 
was only observed when the operator was 
located less than 3 bp away from the TATA-box. Downstream 
of the TATA-box the promoter was stringently 
repressed when the distance between the operator and 
the TATA-box was not more than 31 bp. In consideration 
of these data we constructed the so-called Triple-Op 
promoter, which contained one operator (O1) 1 bp upstream 
of the TATA-box, a second operator (O2) 1 bp 
downstream of the TATA-box and a third operator (O3) 
23 bp downstream of the TATA-box (Figure 1a). In Figure 
1b we demonstrate that all three operators within this 
promoter fragment can simultaneously be occupied by the 
Tet repressor protein, though the spacing of 9 bp between 
O1 and O2 and the spacing of 2 bp between O2 and O3 is 
less than in the wild-type arrangement of 11 bp found 
between the two operator sites in the Tn10-encoded 
regulatory region (Hillen et al., 1984). With limiting 
amounts of Tet repressor four different bands can be 
observed in a mobility shift assay: free DNA, DNA bound to 
one repressor dimer, DNA bound to two repressor dimers 
and a fourth complex representing a fully saturated 
operator fragment. Next we analysed, in a transient 
expression system using chloramphenicol acetyl transferase 
(Cat) as a reporter enzyme (An, 1987), whether the 
combination of the CaMV 35S promoter with three perfectly 
palindromic operator sequences affected promoter 
strength. As shown in Figure 1c, no significant difference in 
gene expression was observed when comparing the 
Triple-Op promoter in the de-repressed stage with the 
wild-type CaMV 35S promoter. In the absence of the 
inducer, no detectable promoter activity was observed in 
tetR+ protoplasts synthesizing the Tet repressor, indicating 
stringent repression. For the analysis of the promoter 
in transgenic plants, it was fused to the β-glucuronidase 
(GUS) gene (Jefferson et al., 1987) and transferred to the 
genome of a tetR+ plant using Agrobacterium tumefaciens 
mediated gene transfer. Leaves from 20 hygromycin 
resistant regenerated shoots were treated with Tc 
by vacuum infiltration (Gatz et al., 1991) and GUS activity 
was determined. All 20 plants showed no GUS activity 
before treatment with Tc and in every case a strong 
increase in GUS activity after Tc treatment. For control 
purposes a chimeric gene consisting of the wild-type 
CaMV 35S promoter (Covey and Hull, 1985) and the GUS 
gene was transferred to tobacco plants. Due to the variation 
of the expression levels of transgenes (Sanders et al., 
1987) we combined RNA from the three highest expressing 
plants from each transformation and subjected them 
to Northern blot analysis. As shown in Figure 1d, no 
difference in gene expression was observed, indicating 
that maximal wild-type promoter activities can be reached 
with the Triple-Op promoter in the de-repressed stage. 
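
The dyad symmetry of the 19 bp tet operator quoted in the Figure 1 legend can be checked directly. The following is a small illustrative sketch, not part of the original study; the helper names are hypothetical, and only the operator sequence itself is taken from the legend.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Operator sequence as given in the Figure 1 legend.
TET_OPERATOR = "ACTCTATCACTGATAGAGT"


def reverse_complement(sequence):
    """Reverse complement of an upper-case DNA sequence."""
    return sequence.translate(COMPLEMENT)[::-1]


def dyad_arms(sequence):
    """Split a sequence into left arm, central part and right arm."""
    half = len(sequence) // 2
    left = sequence[:half]
    center = sequence[half:len(sequence) - half]
    right = sequence[len(sequence) - half:]
    return left, center, right


left, center, right = dyad_arms(TET_OPERATOR)
arms_match = left == reverse_complement(right)

print(f"left arm:  {left}")
print(f"center:    {center}")
print(f"right arm: {right}")
print(f"arms are inverted repeats: {arms_match}")
```

Because the operator has an odd length, the two 9 bp arms form the inverted repeat around a single central base pair, so the full 19-mer is not identical to its own reverse complement.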

Quantitation of the repression efficiency of the 
Triple-Op promoter in transgenic tobacco plants 

At the level of GUS enzyme activity we consistently ob- 
served a 50-fold increase of expression after infiltration of 
single leaves with Tc. which is 10-fold more than we 
observed with one of our previous constructs which con- 
tains only two operators downstream of the TATA-box 
(Gatz et aL, 1991). Because all plants showed the same 
pattem of Tc dependent gene expression, we kept 10 of 
the highest expressing plants and randomly picked one of 
those for the various forms of analysis shown below. When 
analysing 30 jxg total RNA on a Northern blot, we could not 
detect any GUS RNA in the repressed stage, with longer 
exposures yielding only background signals from cross- 
hybridizing ribosomal RNAs (data not shown). In order to 
quantitate the repression efficiency at the RNA level, we 
analysed poly(A)'*' RNA from repressed and de-repressed 
leaves of one of the 10 highest expressing transgenic 
plants. The amount of GUS mRNA in untreated and Tc- 
treated leaves was compared by using a dilution series of 
the signal obtained from Tc-treated leaves with mRNA 
from untransformed plants as a concentration standard. 
mRNA (800 ng) obtained from untreated and Tc treated 
leaves was loaded on a gel as well as a mixture (total 
amount: 800 ng) of poly(A)"^ RNA prepared from untrans- 
formed tobacco plants with 40, 26, 8, and 4 ng of the 
mRNA from Tc-treated plants. The signal obtained after 
rehybridlzation of the blot with a probe from the ribosomal 
gene S4 (Devi et aL, 1989) was used to standardize the 
amount of mRNA loaded. Taking into account that about 
twofold more poly(A)*** RNA was present in the lane con- 
taining mRNA from untreated leaves, we judged from the 
dilution series shown in Figure 2, that the activity of the 
Triple-Op'-promoter is repressed at least 100-fold at the 
RNA level. 
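
The arithmetic behind such a dilution series can be sketched as follows. This is a minimal illustration and not the authors' analysis: the function names are hypothetical, the lane amounts are those given in the Figure 2 legend, and the choice of which standard lane the untreated signal is compared against is an assumption made only to show how a lower bound is obtained.

```python
TOTAL_NG = 800.0  # total poly(A)+ RNA loaded per lane, as stated in the text


def standard_fraction(tc_rna_ng, total_ng=TOTAL_NG):
    """Fraction of a mixed lane that is mRNA from the Tc-treated plant."""
    return tc_rna_ng / total_ng


def min_fold_repression(reference_fraction, loading_excess=1.0):
    """Lower bound on repression.

    Assumes the untreated-lane signal is no stronger than the standard lane
    containing `reference_fraction` of Tc-treated mRNA, and that the untreated
    lane carried `loading_excess` times more RNA than intended.
    """
    return loading_excess / reference_fraction


# Standards from the Figure 2 legend: 40, 16, 8 and 4 ng of Tc-treated mRNA.
fractions = [standard_fraction(ng) for ng in (40, 16, 8, 4)]

# Hypothetical reading of the blot: untreated signal no stronger than the 2%
# standard, with about twofold more RNA in the untreated lane.
bound = min_fold_repression(standard_fraction(16), loading_excess=2.0)
print(fractions, bound)
```

Under that assumed reading the bound works out to 100-fold; a fainter untreated signal, matched against a more dilute standard, would give a correspondingly larger bound.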

De-repression of the 'Triple-Op' promoter in whole plants 

With the repression efficiency being tight enough to observe
significantly different levels of GUS activity in the
repressed versus the de-repressed stage, we started
characterizing different modes of Tc application by doing
in-situ stainings of whole plants with X-Gluc (Jefferson,
1987). First, we cultivated shoots of one transgenic plant
on 2MS medium supplemented with 1 mg l⁻¹ Tc.
Tobacco W38 forms roots without delay when grown in
the presence of this amount of Tc. As shown in Figure 3a,
no GUS activity was observed in the cutting that was
grown without Tc, even after an overnight incubation in
X-Gluc. When grown on Tc-containing medium, however,
dark blue staining representing high GUS activity was
observed in the roots, the two lower leaves which had
been in contact with the medium, and around the vascular
tissue in some of the upper leaves. A leaf from a different
cutting, which had only partly touched the medium,
showed staining in this region, and again around the
vascular tissue. This result indicates that Tc is taken up
through the roots and transported throughout the plant,
and that it can also be taken up directly through the leaf. If
we let a plant grow on 2MS medium without Tc and place
one of the leaves between two separate blocks of agar
containing Tc, we observe a local induction within
this leaf after 6 h. If we extend this treatment for 3 days we observe
GUS activity in the lower and upper leaves as well as in the
roots, which might indicate that Tc is transported through
the phloem.

Figure 2. Northern blot analysis of poly(A)⁺ RNA from a transgenic plant
containing a chimeric repressor gene and a 'Triple-Op'-GUS gene.
Poly(A)⁺ RNA from an untreated and a Tc-treated tobacco plant was
analysed, as well as different amounts of poly(A)⁺ RNA from the Tc-treated
plant mixed with mRNA from untransformed tobacco W38. Tc treatment
was performed by vacuum infiltration of single leaves with 1 mg l⁻¹ Tc and
RNA was extracted after 2 days. Lane (-): 800 ng RNA from an untreated
plant; lane (+): 800 ng RNA from a Tc-treated plant (TcRNA); lane (5%): 760
ng W38 RNA + 40 ng TcRNA; lane (2%): 784 ng W38 RNA + 16 ng TcRNA;
lane (1%): 792 ng W38 RNA + 8 ng TcRNA; lane (0.5%): 796 ng W38 RNA +
4 ng TcRNA. The blot was hybridized first with a GUS probe and afterwards
with an S4 probe.

In order to achieve homogeneous distribution of the
antibiotic throughout the plant we removed the lids of our
tissue culture containers once a day for 15 min under
sterile conditions, thus enhancing transpiration. In addition
we placed a piece of sterile cheesecloth between the lid
and the container, which also served to increase transpiration
in the growth chamber. As a second improvement we
grew the plants on moist vermiculite, which had the
advantage that fresh Tc could be added without cutting off
the roots. As shown in Figure 4, homogeneous staining
was observed throughout the plant, indicating sufficient
distribution of the antibiotic. As determined by the
fluorometric GUS assay, the gene was induced 500-fold under
these conditions. This indicated that the 50-fold induction
that had been measured 2 days after infiltration of single
leaves might have been an underestimation, because maximal
levels of protein are not reached under those
conditions.

Figure 3. Localization of GUS enzyme activity after growing two cuttings of one transgenic plant on 2MS medium (a), which was supplemented with 1 mg l⁻¹
Tc in (b, c).

The dark blue staining represents high levels of GUS enzyme activity. The part of the leaf in (c) that was stained intensely blue had touched the medium. Plants
had been grown on Tc-containing medium for 2 weeks.

Figure 4. Localization of GUS enzyme activity.

(a) Localization of GUS enzyme activity after growing plantlets on vermiculite with Tc-containing Hoagland buffer for 2 weeks. To enhance transpiration, the lid
of the tissue culture container was removed under the hood for 15 min once a day. Every 3 days, Hoagland buffer containing fresh Tc was added.

(b) Direct comparison of an uninduced leaf with a leaf detached from the plant shown in (a).

Figure 5. Localization of GUS enzyme activity after growing plantlets on vermiculite with Hoagland buffer.

Plantlets shown in (b) and (e) were submersed in a Tc-containing buffer (1 mg l⁻¹) for 15 min once a day for 2 weeks. The plant in (a) is an independent cutting of the
plant in (b), but grown on 2MS without Tc. The leaf shown in (d) is from an independent cutting of the plant in (e) that had been kept on 2MS without Tc. The plant
in (c) was transformed with a chimeric wild-type CaMV 35S promoter-GUS construct.

As a third way of induction under tissue culture conditions
we put whole plantlets into a beaker containing a
buffer with 1 mg l⁻¹ Tc. This type of Tc application also led
to a GUS staining pattern that was indistinguishable from
that of tobacco plants transformed with the wild-type 35S
promoter fused to the GUS gene (Figure 5). Thus the light
blue staining in the upper leaf is a property of the CaMV
35S promoter and not due to limited Tc uptake in younger
leaves. In the long run, however, this mode of Tc application
leads to some browning of the stem and the roots, so
that we consider uptake through the roots as described
above as more useful. Again, in this experiment, 500-fold
induction at the level of GUS activity was observed.

In plants grown under greenhouse conditions, the 
promoter was de-repressed by applying the antibiotic 
through the roots (Figure 6). Plants suffered when sprayed 
with Tc in the presence of Saprogenate and uptake was 
poor when Saprogenate was omitted (data not shown). 

Kinetics of de-repression and re-repression of the 
promoter 

We have followed the kinetics of de-repression by taking
samples from a Tc-treated plant and assaying them for
GUS activity and RNA expression (Figure 6). As Tc uptake
through roots had turned out to be the best way for
de-repression of the promoter, we used the following set-up
for Tc treatment under greenhouse conditions. One of the
transgenic plants, which contained six leaves at day 0,
was cultivated with its roots hanging down into a beaker
containing Hoagland buffer, with fresh air being supplied
through an aquarium pump. On day 0, Tc was added at a
concentration of 1 mg l⁻¹, and the medium was exchanged
only once, on day 1, for a fresh batch of buffer, again
containing 1 mg l⁻¹ Tc. Each day a whole leaf was harvested
for analysis of GUS activity and RNA extraction.
Whereas maximal amounts of RNA were already detected
on day 2, maximal levels of GUS enzyme activity (40-fold
de-repression in this experiment) were observed on day 4.
As long as detectable amounts of GUS RNA were present,
GUS activity continued to accumulate. We cannot clearly
state at this moment whether this accumulation of the gene
product, delayed with respect to the RNA, is typical for the
GUS reporter system or whether it is caused by the untranslated
leader, which contains at its 5' end an almost complete
palindromic operator sequence (Figure 1a). The kinetics of
appearance of a gene product has to be specifically
determined for each individual gene product, whereas
RNA induction can be assumed to be maximal within a few
days after addition of the antibiotic.

Figure 6. Time course of de-repression after Tc uptake through the roots
in plants adapted to greenhouse conditions.

Arrows point to the 2 days where fresh Tc was added (1 mg l⁻¹). Each day a
leaf was harvested for analysis of GUS enzyme activity and mRNA
abundance.

The antibiotic seems to become inactivated rather
quickly in the plant, because 2 days after the last addition
of fresh Tc, mRNA levels started to decrease and were
indistinguishable from background levels after 4 days.
This could be due to the light sensitivity of the antibiotic,
because we do not see this effect when we incubate
Tc-treated leaves in the dark (Gatz et al., 1991). One week
after the last treatment with Tc, no GUS activity was
detectable. This feature is extremely useful for the transient
expression of a transgene, but for continuous
de-repression fresh Tc has to be added at least every
other day.

In conclusion, we have constructed a tightly repressed
plant promoter that can be de-repressed with very low
amounts of Tc. Using GUS as a reporter system we have
demonstrated that homogeneous de-repression can be
achieved by Tc uptake through the roots or through
leaves. In terms of repression efficiency and
de-repressibility the system seems to be suitable for the controlled
expression of any transgene. It remains to be investigated,
however, whether plants carrying genes that are lethal when
expressed can be regenerated using this system. Though
Tc is applied at concentrations where we did not observe
any phenotypic effect or a reduction of expression of the
S4 gene or the tetR gene under the control of the CaMV
35S promoter (Gatz et al., 1991), we are going to develop
non-antibiotic analogues as inducers. Studies on the structural
requirements of the Tc-Tet repressor interaction have
already indicated that the Tet repressor and ribosomes
recognize the drug in a different manner
(Degenkolb et al., 1991). In addition, the crystal structure
of the Tet repressor-Tc complex is currently being solved
(Parge et al., 1984), so that detailed information on the
synthesis of a non-antibiotic inducer should be available in
the near future (W. Hillen, personal communication).



Experimental procedures 

Plants, bacterial strains and media 

Nicotiana tabacum L. was obtained through 'Vereinigte
Saatzuchten' (Ebstorf, Germany). Plants in tissue culture were grown
under a 16 h light/8 h dark regime on Murashige and Skoog
medium (Murashige and Skoog, 1962) containing 2% sucrose
(2MS) or on vermiculite with Hoagland buffer. Escherichia coli
strains DH5α (Bethesda Research Laboratories, Gaithersburg,
USA) and WH207/pRT241 (Wissmann et al., 1986) were
cultivated using standard techniques (Sambrook et al., 1989).
Agrobacterium tumefaciens strain C58C1 containing pGV2260
(Deblaere et al., 1984) was cultivated in YEB medium (Vervliet
et al., 1975).



Reagents 

DNA restriction and modification enzymes were obtained from
Boehringer Mannheim (Ingelheim, Germany) and New England
Biolabs (Danvers, USA). Synthetic oligonucleotides were
synthesized on an Applied Biosystems (Foster City, USA) DNA
synthesizer (380A). Chemicals were obtained through Sigma
Chemical Co. (St Louis, USA) or Merck (Darmstadt, Germany).






Recombinant DNA techniques

Standard procedures were used for recombinant DNA work
(Sambrook et al., 1989).



Constructs 

First an oligonucleotide containing a SpeI (-53), a SnaBI (-32), a
StuI (-22), an XbaI (-16), an XhoI (-3) and a BglII (+2) site was
inserted between the HgaI site (-55) and the BglII site (+2) of
pTET7, yielding plGF107, using the same strategy as described
previously (Gatz and Quail, 1988). Two complementary
oligonucleotides with cohesive SpeI and BglII sites were synthesized:

oligonucleotide 1:
CTAG-ACTCTATCAGTGATAGAGT-G-TATATAA-G-ACTCTATCAGTGATAGAGT-GA-ACTGTATCAGTGATACAGT-TAACGGTACCT

oligonucleotide 2:
CTAGAGGTACCGTTA-ACTCTATCACTGATAGAGT-TC-ACTCTATCACTGATAGAGT-C-TTATATA-C-ACTCTATCACTGATAGAGT

This synthetic DNA fragment, which contained three operators, the CaMV
35S TATA-box as well as an HpaI and an Asp718 site downstream
of the third operator, was inserted into plGF107 cut with
SpeI and XbaI, yielding pTriple-Op-Cat. Recombinant clones
were detected using the repressor titration system described by
Wissmann et al. (1986). This modified promoter was cloned as a
SmaI-XbaI fragment in front of the GUS gene, using pAT3. The
promoter in pAT3 was excised as an Asp718-XbaI fragment and
was replaced by the 'Triple-Op' promoter fragment after filling in
the Asp718 site of pAT3. pAT3, which is a binary vector containing
a hygromycin resistance gene, was used to transform a tetR⁺
transgenic plant via Agrobacterium tumefaciens mediated gene
transfer (Rosahl et al., 1987).
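
As a small consistency check on the two oligonucleotides, one can confirm that the operator and TATA-box segments printed for oligonucleotide 2 are the reverse complements of those printed for oligonucleotide 1. The sketch below is only illustrative: the helper name is hypothetical, and the sequence segments are copied from the text as printed, so any transcription error in the source would carry over.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Segments as printed in the two oligonucleotides (dashes removed).
OPERATOR_OLIGO1 = "ACTCTATCAGTGATAGAGT"
OPERATOR_OLIGO2 = "ACTCTATCACTGATAGAGT"
TATA_OLIGO1 = "TATATAA"
TATA_OLIGO2 = "TTATATA"

print(reverse_complement(OPERATOR_OLIGO1) == OPERATOR_OLIGO2)
print(reverse_complement(TATA_OLIGO1) == TATA_OLIGO2)
```

Both comparisons hold for the segments shown, which is what one expects of two strands designed to anneal.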



Binding studies with purified Tet repressor 

The 'Triple-Op' promoter was excised as an EcoRI/BglII
fragment, purified from vector DNA using the 'Gene Clean' kit
from Dianova (Hamburg) and end-labelled by filling in the
protruding ends in the presence of [α-³²P]dATP using Klenow
polymerase. Binding reaction and gel electrophoresis were
carried out as described previously (Gatz et al., 1991).



Transient expression in tobacco protoplasts 

Isolation, transformation and chloramphenicol acetyl trans- 
ferase assays were essentially as described by Frohberg et al. 
(1991). 



Northern blot analysis 

Total RNA from leaves was prepared according to Logemann et
al. (1987). Poly(A)⁺ RNA was prepared using the 'Dynabeads
mRNA Purification Kit' from Dynal (Hamburg). Blotting and
hybridization were carried out as described previously (Heyer and 
Gatz, 1991). 



Assays for GUS activity 

For the fluorometric GUS assay, explants were homogenized and
incubated with the substrate 4-methylumbelliferyl-β-D-glucuronide
at 37°C. Quantification of the fluorescence was done
according to Jefferson (1987) and Jefferson et al. (1987). Protein
concentrations were determined according to Bradford (1979).
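
For orientation, the fold-induction figures quoted in the Results follow from protein-normalized activities of this kind. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the unit (pmol 4-methylumbelliferone per minute per mg protein) is a common convention rather than something stated here, and the numerical inputs are invented purely to show the calculation.

```python
def specific_activity(pmol_mu, minutes, mg_protein):
    """GUS specific activity in pmol 4-MU released per minute per mg protein."""
    if minutes <= 0 or mg_protein <= 0:
        raise ValueError("incubation time and protein amount must be positive")
    return pmol_mu / (minutes * mg_protein)


def fold_induction(induced, uninduced):
    """Ratio of induced to uninduced specific activity."""
    if uninduced <= 0:
        raise ValueError("uninduced activity must be positive")
    return induced / uninduced


# Invented example values, not data from the paper.
uninduced = specific_activity(pmol_mu=30.0, minutes=60.0, mg_protein=0.5)
induced = specific_activity(pmol_mu=15000.0, minutes=60.0, mg_protein=0.5)
print(fold_induction(induced, uninduced))
```

Normalizing to protein before taking the ratio keeps differences in extraction yield between samples from being mistaken for induction.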



For in-vivo staining, intact plant material was vacuum infiltrated
with 1 mM X-Gluc (5-bromo-4-chloro-3-indolyl-β-D-glucuronic
acid cyclohexylammonium) and incubated overnight at 37°C.



Tobacco transformation 

Transformation of tobacco plants was carried out using the
Agrobacterium tumefaciens leaf disc technique as described by
Rosahl et al. (1987).



Application of tetracycline to the plants

For Tc application under axenic conditions, plants were either
grown on 2MS medium with 1 mg l⁻¹ Tc, or on vermiculite in
Hoagland buffer with 1 mg l⁻¹ Tc. Alternatively, plants were
submersed once a day for 15 min in 1 mg l⁻¹ Tc in 50 mM sodium
citrate (pH 5.5). Aerial parts of plants adapted to greenhouse
conditions were submersed once a day for 15 min in 1 mg l⁻¹ Tc,
0.025% Saprogenate (Hoechst, Frankfurt) in 50 mM sodium
citrate (pH 5.5). For Tc uptake through roots, plants were
cultivated in a beaker containing Hoagland buffer and 1 mg l⁻¹ Tc.
Oxygen was supplied through an aquarium pump.

A detailed analysis of the expression of a chimeric
gene, consisting of the upstream region of the nuclear
photosynthetic gene ST-LS1, encoding a component of
the water-oxidizing complex of photosystem II, fused to
the coding sequence of β-glucuronidase (GUS) as a
reporter, is described. The expression of this chimeric
gene at the cellular level was detected by histochemical
methods and shows that the expression of this gene is
correlated with the presence of chloroplasts. Interestingly,
the GUS activity was not only detected in typical
photosynthetic tissues, e.g. leaves and stems, but also in
green roots containing chloroplasts. In contrast, no activity
was detected in neighbouring white root tissue which was
devoid of chloroplasts. One can therefore separate the
relative importance of the (morphological) differentiation
steps responsible for the formation of tissues normally
involved in photosynthesis from the importance of the
developmental stage (characterized by the presence of
chloroplasts) for the expression of this nuclear
photosynthetic gene. Our data strongly suggest that the
developmental stage of the plastids is the primary
determinant for the activity of this nuclear photosynthetic
gene, although they do not yet allow the exclusion of the
reverse type of control, i.e. control of the differentiation
of the plastid by the expression of certain nuclear genes.
A chimeric gene, consisting of the promoter of the 35S
cauliflower mosaic virus (CaMV) gene and the GUS
coding sequence, was used as a control throughout the
experiments, confirming that the observed differential
ST-LS1-GUS gene expression reflects the particular
transcriptional regulation imparted on this gene by its
cis-acting regulatory sequences.

Key words: cell-specific expression/chloroplast dependent
expression/photosynthetic gene/35S promoter



Introduction

One important feature of eukaryotes is the fact that their
genetic information is divided between two or, in the case
of higher plants and algae, three different organelles: the
nucleus, the mitochondria and the plastids. One important
task for the cell is the coordination of the expression of genes
present in these different cell compartments. This is of special
importance in view of the fact that many plastidic and
mitochondrial proteins are encoded by nuclear genes. Yeast
mutants which are devoid of mitochondrial DNA, but
nevertheless form organelles which structurally resemble
mitochondria, are examples of the importance of the nuclear
genome.

The photosynthetic apparatus of higher plants consists of
several large protein complexes. As these complexes are
encoded by both nuclear and plastidic genes, the plant cell
is faced with the problem of coordinating the
expression of a large number of genes present in both
compartments.

The molecular mechanisms which lead to this coordinated
expression are unknown. In addition to light irradiation,
which triggers the expression of several nuclear
photosynthetic genes (Tobin and Silverthorne, 1985), the
developmental stage of the cell is also important for their
expression. In mature plants these genes are highly expressed
in leaf mesophyll cells, whereas under natural growth
conditions no expression is detectable in, for example, roots.

Several recent observations indicate that a 'plastidic factor'
might be involved in the regulation of nuclear photosynthetic
genes. It has been reported by Oelmüller and Mohr (1986)
that the photo-oxidative damage of chloroplasts in mustard
seedlings grown on a medium containing the herbicide
Norfluorazon leads to a severe reduction of the amount of
translatable RNA encoding the small subunit of ribulose
bisphosphate carboxylase (RBCS) or the chlorophyll a/b
binding protein (CAB). After a partial recovery of the
chloroplasts, the amount of translatable mRNA increases
again (Schuster et al., 1988). Similar effects have been
observed for the accumulation of CAB mRNA in carotenoid
deficient tissue of maize seedlings where the carotenoid
deficiency was due either to a mutation or to treatment with
a herbicide (Mayfield and Taylor, 1984). Chlorophyll
deficient maize seedlings, however, which contain plastids
arrested in a developmental stage prior to chloroplast
formation, accumulate normal levels of CAB mRNA
(Mayfield and Taylor, 1984).

In the cases described above, the photo-oxidative damage
of the chloroplasts did not affect the expression of several
genes encoding cytoplasmic proteins (Reiß et al., 1983;
Mayfield and Taylor, 1984). These and other observations
(Eckes et al., 1985; Simpson et al., 1986; Borner, 1986;
Stockhaus et al., 1987a) can be taken as indicative of a
so-called 'plastidic factor', produced by the chloroplasts at
a certain stage of development and essential for the
expression of nuclear encoded chloroplastidic proteins. The
observations summarized above are, however, hampered by
the fact that all these data are based either on the use of
inhibitors or of mutants leading to a photo-oxidative damage
of the chloroplasts. With these experiments it is difficult
to prove that the photo-oxidation will only influence the
expression of the photosynthetic genes studied by the
different authors and not result in any side effects.
Furthermore, these data are all based on the analysis of
tissue homogenates. An analysis of the expression of these

Fig. 2. Histochemical localization of the GUS enzyme activity in leaves and stem of tobacco plants transformed with the ST-LS1-GUS or the
35S-GUS gene. (A) Dark-field photograph of a transverse leaf section of a 35S-GUS plant. The dark blue staining represents high levels of GUS
enzyme activity. Bright-field photographs of leaf epidermis of a ST-LS1-GUS plant (B) and a 35S-GUS plant (C); trichomes of a ST-LS1-GUS
plant (D) and a 35S-GUS plant (E). Dark-field photographs of transverse stem sections of a ST-LS1-GUS plant (F) and a 35S-GUS plant (G);
longitudinal sections of the shoot apex of a ST-LS1-GUS plant (H) and a 35S-GUS plant (I). a, axillary bud; am, apical meristem; c, cortex
parenchyma; e, epidermal cell; g, guard cell; p, pith parenchyma; ph, phloem; pp, palisade parenchyma; sp, spongy parenchyma; tr, trichomes;
v, vascular tissue; x, xylem.



Expression pattern in non-photosynthetic organs, e.g.
roots and tubers of transgenic potato plants

In a second series of experiments we analysed the expression
of the GUS fusions in organs characterized by the lack of
chloroplasts under normal growth conditions.

The histochemical analysis of potato tuber cross-sections
demonstrates that the ST-LS1-GUS gene is not expressed
in tubers under normal growth conditions. In tubers exposed
to white light for a few days, however, weak ST-LS1-GUS
gene expression is detectable in rudimentary leaves of
sprouting green buds (see Figure 3A) and the outer layer
of chloroplast containing parenchymatic cortex cells (see
Figure 3C). The 35S-GUS gene is expressed in parenchymatic
cells associated with the vascular tissue in the pith
(see Figure 3D) and in germinating buds (see Figure 3B).
There was no expression detectable in the starch containing
parenchymatic cells in the pith and in the periderm tissue
of the tuber. Using transversal sections of roots of transgenic
potato plants grown in soil we detected no ST-LS1-GUS
gene expression (see Figure 3E), whereas the 35S-GUS
gene is highly expressed in the parenchymatic tissue of the
root (see Figure 3F).

Roots of potato plants grown in tissue culture, which
are therefore exposed to light, do contain chloroplasts in
parenchymatic cells (Eckes et al., 1985) (see Figure 3G).
The redifferentiation of these parenchymatic cells to
chloroplast containing cells starts at a certain distance from
the root tip. White side roots, growing out of older green
roots (see Figure 3H), therefore represent a unique system
allowing a direct comparison between ST-LS1-GUS gene
expression in green roots, which contain chloroplasts, and
young whitish roots which do not.

In parenchymatic cells containing chloroplasts a strong
GUS enzyme activity is detectable (see Figure 3J), whereas
there is no GUS enzyme activity detectable in the young
outgrowing roots (see Figure 3I and J). In whitish roots of
tobacco plants grown in tissue culture exposed to white light,
we also observed chloroplasts by fluorescence microscopy,
though their number is much lower. In these tobacco roots
the ST-LS1-GUS gene is expressed, albeit at a rather low
level (data not shown). In contrast to this highly differential
expression of the ST-LS1-GUS gene in correlation to the
presence of chloroplasts, the 35S-GUS gene is expressed
in white as well as in green parenchymatic root cells (data
not shown).

Fig. 4. Histochemical analysis of GUS enzyme activity in callus and suspension culture cells derived from potato plants transformed with the
ST-LS1-GUS or the 35S-GUS gene. Bright-field photograph of a ST-LS1-GUS callus (A) and 35S-GUS callus (B). Dark-field photograph of
ST-LS1-GUS suspension culture cells (C) and 35S-GUS suspension culture cells (D).

Expression in potato callus and suspension culture
cells

As a final step in our analysis, the expression pattern of both
genes in undifferentiated callus and suspension culture cells
was determined. A weak expression of the ST-LS1-GUS
gene was detected in green callus cells (see Figure 4A). In
callus cells representing a different developmental stage
characterized by the lack of chloroplasts, no GUS activity
was detected. The callus used for these experiments was
derived from transgenic potato plants displaying high levels
of GUS activity in leaves. The 35S-GUS gene is expressed
to much higher levels in callus cells (see Figure 4B).

In tobacco as well as potato suspension culture cells grown
under heterotrophic conditions and devoid of chloroplasts
we again did not detect any ST-LS1-GUS gene expression
(see Figure 4C). This contrasts with the high expression of
the 35S-GUS gene in these cells (see Figure 4D).

Discussion 

The photosynthetic apparatus localized in the chloroplasts
of higher plants contains protein complexes which are
encoded by the nuclear and the plastidic genome. In view
of the central importance of the photosynthetic activity for
the survival of the plant, it is obvious that the expression
of the genes of both compartments must be interlinked and
tightly controlled. Whereas post-transcriptional control
appears to be especially important for the regulation of
a number of plastidic genes (reviewed by Gruissem, 1989),
the expression of nuclear photosynthetic genes appears to
be regulated primarily at the transcriptional level. Light
signal transducing systems in which phytochrome is involved
play an essential role in this regulation (Tobin and
Silverthorne, 1985). The coordinated expression of both
nuclear and plastidic genes has, however, received less
attention.

The data described in the Results point to a very strong
correlation between the expression of a defined nuclear gene
from potato (called ST-LS1), encoding a component of the
water oxidizing complex of photosystem II, and the presence
of chloroplasts. The three most striking examples of the
correlation of the presence of chloroplasts with the expression
of this nuclear photosynthetic gene are the data obtained for
the leaf epidermis, root tissue and the potato tuber. In the
epidermis of leaves, the ST-LS1-GUS gene is expressed
in guard cells and trichomes which contain chloroplasts,
whereas in epidermal cells which are devoid of chloroplasts
there was no detectable ST-LS1-GUS gene expression. This
result also indicates that, irrespective of the nature of the
signal which is responsible for the induction of the ST-LS1
gene, it most likely has to be created within the cell itself
and does not have any dominant influence on neighbouring
cells. This signal therefore is unlikely to be able to diffuse
or to be transported to other cells.

Our observation that the ST-LS1-GUS gene can be
expressed in parenchymatic root and tuber cells, provided
these tissues are made to contain green chloroplasts,
represents an important finding with respect to the relative
importance of the morphological differentiation of cells and
the developmental stage of the plastids with regard to
expression of the ST-LS1 gene. The observation that the
ST-LS1-GUS gene is actively expressed in root and tuber
cells containing chloroplasts, whereas it is not expressed in
neighbouring cells of the same type which are devoid of
chloroplasts, suggests that the presence of chloroplasts might
be a prerequisite for the expression of the gene concerned.
It should, however, be mentioned that these observations would also be
compatible with an inverse type of control, i.e. control
of the differentiation of the plastid by the expression of
certain nuclear genes. These results also demonstrate that
light, another factor often connected with the expression of
photosynthetic nuclear genes, while essential, is not
sufficient for induction of ST-LS1 gene expression. All
further data described in the Results are in agreement with
the main conclusion described above, i.e. the importance
of the presence of chloroplasts for expression of the ST-LS1
gene. This result was obtained from the analysis in the
homologous system (potato) as well as in the heterologous
system (tobacco) for all tissues analysed.

The approach used in this study, i.e. the histochemical
detection of β-glucuronidase activity from a chimeric gene
transcriptionally driven by the promoter region of the
ST-LS1 gene, was used for several reasons.

Firstly we wanted to know whether or not the postulated 
plastidary signal acts at the level of transcription. The 
chimeric gene used as a reporter consisted of regulatory 
sequences derived from a photosynthetic gene and of a 
coding sequence derived from a prokaryotic gene. We 
assumed that a prokaryotic mRNA would not be influenced 
markedly by plant specific post-transcriptional regulation 
mechanisms. 

As outlined in the Introduction, the importance of a plastidic
factor for the expression of nuclear photosynthetic genes has
been implied by several studies. These studies relied on the
oxidative damage of chloroplasts by either the use of
inhibitors of carotenoid biosynthesis or on the analysis of
albino mutants. These previous data cannot with certainty
exclude the possibility that the suppression of the activity
of photosynthetic genes is due to a non-specific side effect
of photo-oxidative damage. Our data, in contrast, were
obtained in a 'wild-type' situation and in addition allowed
us to monitor the expression on the cellular level.

It is important to examine whether or not the observed
differential expression of the GUS enzyme is exclusively due
to the specificity imparted by the ST-LS1 promoter. The
expression of a chimeric 35S-GUS gene was therefore
analysed in parallel and the expression patterns obtained for
both genes were compared. This kind of analysis showed
that the observed differential expression of the GUS gene
results from ST-LS1 promoter activity and not, for example,
from accessibility of the substrate or differences of GUS
mRNA and protein stability.

Two other reports have to some extent described in a
similar way the correlation between expression of another
photosynthetic gene and the presence of chloroplasts. Using
immunocytochemical methods, Aoyagi et al. (1988) showed
that a chimeric gene consisting of the promoter of the nuclear
photosynthetic small subunit RBCS gene fused to the coding
sequence of the CAT gene was expressed in leaf and stem
cells containing chloroplasts. A similar result was obtained
by Jefferson et al. (1987), who demonstrated that treatment
of stems with strong white light led to the formation of many
chloroplasts in cortical parenchyma cells (chlorenchyma) and
led to an increased level of expression of a chimeric gene
consisting of a RBCS gene promoter fused to the GUS coding
sequence. In these two cases the expression of the respective
photosynthetic gene could not be separated from the
formation of the typical photosynthetic tissues (leaves and stem).
Nevertheless the observation that the cis-acting regulatory
elements of different photosynthetic genes apparently led to
the same kind of expression pattern as described for the
ST-LS1 gene suggests that the hypothesized control of the
expression of the ST-LS1 gene by the chloroplast could be
a general phenomenon and might be relevant for a number
of nuclear photosynthetic genes. The identification of tissues
which, except for the difference of the developmental stage
of their plastids, are very similar (tissue of green and white
roots for example) will be very useful for the characterization
of the signal(s) controlling the activity of nuclear
photosynthetic genes.

Materials and methods 

Recombinant DNA techniques 

Standard procedures were used for recombinant DNA work (Maniatis et
al., 1982).

Transformation of tobacco and potato plants and tissue culture 
techniques 

The chimeric genes were inserted in the vector BIN 19 (Bevan, 1984) and
introduced into the Agrobacterium tumefaciens strain pGV2260 (Deblaere
et al., 1985) by direct transformation according to Höfgen and Willmitzer
(1988). In order to transfer the chimeric genes to tobacco cells, leaf discs
of Nicotiana tabacum cv. SNN were infected with the respective
Agrobacterium strain and subsequently regenerated (Horsch et al., 1985). The
transformation and regeneration of Solanum tuberosum cv. Desiree plants
were performed as described by Rocha-Sosa et al. (1989).

Potato and tobacco callus was cultivated on MS medium (Murashige and
Skoog, 1962) supplemented with 2% sucrose and 3 mg/l 2,4-D (potato) or
1 mg/l 2,4-D (tobacco) in a 16 h light/8 h dark rhythm. Suspension cultures
were cultivated in liquid MS medium containing 2% sucrose and 1 mg/l
2,4-D in continuous dim white light.

Histochemical localization

The histochemical reactions were performed as described by Jefferson (1987)
using X-Gluc as substrate. For the sections of plant material a cryo-microtome
was used. The staining reactions were performed with either unfixed cuttings
or with cuttings fixed for 5-15 min in ice-cold 2% formaldehyde, 1 mM
EDTA in 100 mM Na-phosphate (pH 7.0). The fixed cuttings were washed
extensively before the staining reaction. The reaction times varied between 
2 and 16 h.

Transgenic analysis has reached an advanced
state in plants. In the ten years since transformation
with a chimeric selectable marker was reported,
the basic tools required to insert and express
foreign genes have been developed.
However, plant genetic engineers still lack important
tools that are common in other systems; one
that has only recently emerged is the ability to
regulate expression of transgenes with exogenous
chemical agents. This review will briefly cover the
expanding literature on control of foreign gene
expression in plants by application of synthetic
compounds.

The goal of chemical gene induction systems is
to provide the ability to manipulate levels of gene
expression in order to understand better the functions
of individual genes, and to facilitate the production
of large amounts of a specific gene product.
The basic concept underlying such schemes
is isolation of a cis-acting sequence that operates
as the key regulator of a gene with which it is
naturally associated, followed by attaching the
cis-acting element to a gene of interest. This results
in expression of the engineered gene in a
fashion similar to the natural, chemically regulated
gene.

In other well-studied biological systems, the
ability to alter gene expression by simple manipulation
of the growth medium or addition of a
chemical has found widespread use. Common
systems include the lac operon in Escherichia coli
[1], the GAL 1, 4, 10 regulon in yeast [2], and the
glucocorticoid receptor/response element in
mammalian cells [3]. An important commercial
use for chemical gene regulation is the production
of recombinant proteins in fermentation settings
(see e.g. [4]). Chemical control also provides the
ability to study effects of ectopic expression of a
specific promoter, spectacularly demonstrated in
the 'super mouse' that arose from fusion of the
metallothionein promoter to a rat growth hormone
gene [5]. Thus, external regulation of gene
expression serves the needs of both applied and
basic science.

Combinations of cis-acting regulatory sequence
and exogenous chemical regulator have been difficult
to find in plants. An optimal combination of
chemical inducer and target gene results in a
tightly regulated system with very low uninduced
expression that increases rapidly to high levels
upon application of the inducer. The metabolic
principles that underlie chemical gene regulation
in microbes do not readily extrapolate to plants.
For instance, simple inducers of catabolic processes
(such as mono- and disaccharides) which
are so useful in microbial systems are relatively
useless in photoautotrophic organisms. Moreover,
the possibilities of regulating the environment
of auxotrophs in the field are much more
limited than in fermentation systems. Transfer to
special growth conditions for the sole purpose of
gene induction will not be generally useful in agricultural
settings, where conditions are optimized
to maximize plant yield. Starvation for a particular
nutrient or treatment with a chemical that
produces phytotoxicity will be acceptable only in
special situations. Natural plant metabolic signals
and derivatives thereof are likely to be very
useful for regulating foreign gene expression; unfortunately,
only a few such compounds and their
target genes have been elucidated. Given the paucity
of known natural regulators of plant gene
expression, synthetic compounds have to date
been more effectively used.

Two basic classes of chemical gene regulation 
can be distinguished: endogenous systems, which 
use regulatory signals from plant genes that re- 
spond to synthetic chemical treatment, and exo- 
genous systems, which rely on elements from 
genes from other kingdoms, coupled with chem- 
icals that have no effect on expression of native 
plant genes. Endogenous regulatory sequences are 
attractive in that they can be relatively easy to 
manipulate. For instance, the addition of a 5' 
promoter element can be all that is needed to 
control a foreign coding sequence. On the down- 
side, the use of endogenous plant regulatory se- 
quences means that the native genes that are nor- 
mally linked to these regulatory sequences are 
also induced upon addition of the chemical reg- 
ulator. Thus, it is important to pick not only a 
potent inducer/regulatory sequence combination, 
but also an inducer of a class of genes that do not 
adversely affect the development of the plant. In 
addition, background levels of gene expression 
driven by the cis-acting sequence must not be
overly sensitive to physiological changes in the 
plant that result from environmental fluctuation 
or other stresses. The basis for choosing an in- 
ducer that fills these criteria is largely empirical, 
and very few have been tested in intact plants to 
date. Four possibilities have been documented in 
the literature, and are reviewed briefly below. 

Immunization compounds are chemicals that 
induce the systemic acquired resistance (SAR) 
response in plants. SAR is a broad resistance, 
effective against a variety of pathogens, that is 
induced by an initial pathogen infection [6] or 
chemical treatment. The inducing chemicals can 
be of natural origin, such as salicylic acid, or can
be synthetic compounds, such as 2,6-dichloroisonicotinic
acid (INA) [7]. Treatment with a
pathogen or an immunization compound induces
the expression of at least nine sets of genes in
tobacco, the best characterized species [8]. Different
numbers and types of genes can be expressed
in other plants [9, 10].

Fig. 1. Field performance of a PR-1a/GUS chimeric gene in
transgenic tobacco (horizontal axis: time after induction, days).
Six-week-old plants of a homozygous
transgenic line containing a -903 PR-1a promoter fragment
fused to GUS [13] were transplanted to the field 7 days before
induction. Salicylic acid was applied at 50 mM; INA as
a wettable-powder formulation consisting of 25% active ingredient
at 1 mg/ml. Each point represents the average of
duplicate determinations from samples pooled on each day
from three replicate field plots. SA, salicylic acid; MU,
methylumbelliferone.

The promoter region of one tobacco gene, encoding
pathogenesis-related (PR) protein 1a, has
been demonstrated to confer chemically inducible
expression on the β-glucuronidase (GUS) reporter
gene in laboratory settings [11-13]. We
have shown that a PR-1a promoter/GUS fusion
in transgenic tobacco behaves in the field as it
does in the lab, reaching high expression levels
after induction by either salicylic acid or INA
(Fig. 1). Moreover, the PR-1a promoter has recently
been shown to drive chemically inducible expression
of the insecticidal CryIA(b) protein of Bacillus
thuringiensis [14]. This is the only example
in the literature of chemical regulation of a potentially
important transgenic agricultural trait.
Plants expressing B. thuringiensis toxin specifically
in response to an inducing stimulus may be
advantageous compared to constitutive expressers
with respect to resistance management. Population
genetic models in which B. thuringiensis
toxin-resistant alleles are disadvantageous compared
to wild-type (sensitive) alleles predict that
resistance will spread more slowly when insect
populations are given refuge from continual selection
for resistance (reviewed in [15]). Thus, the
ability to induce expression of the toxin would
provide temporal refuge from selection, which
should decrease the rate at which a population
evolves toward resistance.
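
The refuge argument can be illustrated with a minimal one-locus haploid selection sketch. This is not one of the models reviewed in [15]; the function names, the selection coefficient against sensitive insects under toxin (`s`), the fitness cost of resistance without toxin (`c`), the starting allele frequency and the exposure fractions are all hypothetical values chosen only to show the qualitative effect of a temporal refuge.

```python
def resistant_allele_next(p, exposure, s=0.9, c=0.1):
    """One generation of selection on a resistant allele at frequency p.

    exposure: fraction of the time the toxin is expressed (1.0 = constitutive).
    s: fitness loss of sensitive insects while toxin is present (assumed).
    c: fitness cost of resistance while toxin is absent (assumed).
    """
    # Average fitness of each genotype over exposed and refuge periods.
    w_res = exposure * 1.0 + (1.0 - exposure) * (1.0 - c)
    w_sen = exposure * (1.0 - s) + (1.0 - exposure) * 1.0
    mean_w = p * w_res + (1.0 - p) * w_sen
    return p * w_res / mean_w


def generations_until_common(exposure, p0=1e-3, threshold=0.5, max_gen=10_000):
    """Generations until the resistant allele reaches `threshold`, or None."""
    p = p0
    for gen in range(1, max_gen + 1):
        p = resistant_allele_next(p, exposure)
        if p >= threshold:
            return gen
    return None


if __name__ == "__main__":
    for exposure in (1.0, 0.5, 0.05):
        print(exposure, generations_until_common(exposure))
```

Under these assumed parameters, lowering the exposure fraction lengthens the time to resistance, and a sufficiently small exposure prevents the resistant allele from spreading at all because its cost outweighs its benefit.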

The PR-1a promoter/chemical inducer combination
is also likely to find basic uses in plant
biology. An INA-inducible PR-1 gene has been
isolated from Arabidopsis (S. Uknes, E.R. Ward
and J.A. Ryals, unpubl. data). In this intensively
studied genetic model, the ability to control gene 
expression should be a valuable additional tool 
for studying the functions of individual genes of 
interest. 

Safeners are chemicals known to induce the
expression of enzymes involved in metabolism or
detoxification of certain herbicides. The genes induced
typically encode glutathione S-transferases
[16, 17], cytochrome P-450 mixed-function oxygenases
[18-20], or other proteins of unknown
function. One group has reported the isolation of
cDNAs that respond to safener treatment [21].
However, the level of induction for these genes is
not as high as that seen for SAR-related genes
induced by immunization compounds, which can
be induced as much as 10000-fold over background
[8, 10] (J.A. Ryals et al., unpubl. data).
No reports of chimeric constructions using
safener-inducible elements have yet appeared, although
such work is presumably in progress.

Genes involved in phenylpropanoid metabolism
are known to be inducible by a variety of
biotic and abiotic inducers [22]. Lamb and coworkers
showed that a bean chalcone synthase
promoter fused to the E. coli uidA gene in transgenic
tobacco was inducible as much as 18-fold
by pathogen infection, glucan elicitor from Phytophthora
megasperma, and HgCl2 [23]. Thus, in
principle, regulatory sequences from the phenylpropanoid
pathway could be coopted for other
uses. Unfortunately, induction of the phenylpropanoid
pathway may lead to accumulation of undesirable
metabolites that are harmful to normal
plant growth [22].

Work from the laboratory of C.A. Ryan over
the past twenty years has focused on the wound
induction of proteinase inhibitors in tomato and
potato [24]. Experiments with transgenic tobacco
showed that cis-acting sequences of a potato gene
could confer wound inducibility on other genes,
and that the relevant regulatory signals lay in the
3' end of the gene [25]. Recently, outstanding
progress has been made in elucidating the nature
of the chemicals within the plant that signal
wound induction systemically. The volatile lipid
metabolite methyl jasmonate was found to induce
expression of protease inhibitors in several plant
species [26]. Despite the basic interest in this
discovery, dosage and extent of coverage will
probably be difficult to manipulate in systems
using volatile compounds for artificial gene control,
especially in field settings. More significantly,
after an exhaustive search for the in vivo systemic
inducer of protease inhibitors, an 18 amino acid
peptide was found that induces gene expression
in amounts as small as a few femtomoles [27].
The discovery of systemin, as this first peptide
hormone from plants has been called, opens the
door to a previously unexplored area of plant
biochemistry. Presumably, expression of other inducible
gene systems in plants may also be controlled
by exceedingly potent peptides.

Exogenous regulators, which induce genes not 
occurring naturally in the plant, have the attrac- 
tive feature of inducing only the introduced trans- 
gene. Unfortunately, adapting a gene control sys- 
tem from another organism means overcoming 
the formidable hurdles of 'chemodynamics'. Specifically,
the inducing compound must be (1) taken
up efficiently by the plant, (2) moved systemically
to the site of action, and (3) left in an active form
by metabolic pathways that degrade or conjugate
xenobiotics [28].

Schena et al. [29] recently showed that the
mouse mammary tumor virus glucocorticoid receptor
could confer inducibility on a truncated
35S promoter linked to several tandem copies of
the glucocorticoid response element (GRE) in protoplasts.
These experiments were carried out by cotransfection
into tobacco protoplasts. The presence
of the receptor gene caused dexamethasone-dependent
induction of the GRE-driven reporter
gene by as much as 150-fold. The absolute level
of expression achieved was approximately 1/10 of
that seen using a 35S promoter-driven reporter;
presumably, higher levels of expression could be
achieved by optimizing the conjunction of GRE
to plant promoter elements. Similarly, the developmental
and tissue specificity of the newly created
inducible promoter could be varied by using
individual elements from promoters specifically
regulated in time or space during plant development.
To date, however, the functioning of this
otherwise attractive system has not been reported
in stably transformed intact plants.
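
The two figures quoted above (150-fold induction, and an induced level of about 1/10 of a 35S-driven reporter) together imply a basal level, which the following small sketch works out. The helper name and the convention of expressing everything relative to the 35S reporter (set to 1.0) are assumptions made for illustration; they are not taken from [29].

```python
from fractions import Fraction


def uninduced_level(induced_level, fold_induction):
    """Basal expression implied by an induced level and a fold induction."""
    return induced_level / fold_induction


# Expression relative to a 35S promoter-driven reporter (= 1), as an assumption.
induced = Fraction(1, 10)   # "approximately 1/10" of the 35S-driven level
fold = 150                  # "as much as 150-fold"

basal = uninduced_level(induced, fold)

if __name__ == "__main__":
    print("uninduced level relative to 35S:", basal)
```

Because both inputs are approximate upper-end values in the text, the result should be read as an order-of-magnitude estimate of how low the uninduced background is.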

The Tn10 tet repressor/operator is a prokaryotic
control system that has been shown to function
in plants [30]. Tobacco plants were first stably
transformed with the tetR gene under 35S
control. These plants expressed tet repressor at a 
level of ca. 0.01% of total cell protein. These 
TetR-expressing plants were transformed again, 
with a GUS reporter gene driven by a 35S pro- 
moter into which tet operator sequences had been 
integrated. Insertion of two tandem tetO se- 
quences between the TATA box of the 35S pro- 
moter and the start of transcription conferred 50- 
to 80-fold repressibility on the GUS gene in the 
presence of tetracycline. 

The definition of what constitutes an agricul- 
tural trait is changing quickly as genetic engineer- 
ing of plants approaches its tenth anniversary. 
Chemical control has the capability to further ex- 
pand the range of novel compounds that can be 
manufactured at commercially useful levels in 
plants. What are some potential uses of this tech- 
nology? 

One example is the use of plants as bioreactors
to produce recombinant proteins. Van Montagu
and coworkers showed that the neuropeptide
Leu-enkephalin could be produced in the seed of
transgenic Arabidopsis and Brassica by means of
a translational fusion to a napin-like seed storage
protein gene [31]. Conceivably, extremely high
levels of such a peptide, in amounts that would
draw deleteriously on the plant's N resources,
could be synthesized under chemical control, followed
quickly by harvest before significant starvation
affected the crop.

Another recent example of a novel biosynthetic 
capacity conferred on plants through genetic en- 
gineering is the production of polyhydroxybu- 
tyrate [32]. This polyester thermoplastic is syn- 
thesized in three steps from acetyl-CoA by the 
bacterium Alcaligenes eutrophus. The first activity
in the pathway, 3-ketothiolase, is found in plant
cells. Somerville and coworkers introduced bacterial
genes for the remaining two steps, each
under the control of the 35S promoter, into Arabidopsis
thaliana, creating two independent transgenic
lines. An F1 hybrid of these lines accumulated
PHB granules in the cytoplasm, vacuole,
and nucleus. Expression of these genes was 
clearly harmful to the plant, as manifested by re- 
duction in fresh weight between 20 and 45%. 
Thus, the ability to trigger a massive burst of PHB 
synthesis just prior to harvest might be an attrac- 
tive strategy for its high-level production. 

The chemical regulation of foreign genes will be 
especially powerful once homologous gene re- 
placement becomes a routine technique in plant 
biology. Nearly all plant cis-regulatory sequences
studied to date are exceedingly complex [33]. As
a result, a defined fragment of a promoter linked
to a different coding sequence and inserted into
the genome in a random location hardly ever 
functions as well as it does in its native state. 
Once it becomes possible to swap a novel coding 
sequence into the milieu of a regulated locus, the 
foreign sequence will stand a much greater chance 
of being regulated like the native gene. 

The types of traits that have been introduced 
into plants to date are relatively limited. While 
many of these traits will function adequately under 
constitutive expression, others will be useful only 
if regulated. As the biochemical bases for more 
complex plant processes are discovered, increas- 
ing numbers of transgenes are likely to be created 
that will be useful only if placed under exogenous 
control. The existence of easily used, highly controllable
chemical gene regulation systems will
further drive the development of useful regulated
phenotypes. Thus, the field of chemical induction
of gene expression in plants is young indeed, and
the traits that can be controlled using this burgeoning
technology are limited only by the imaginations
of investigators.

Summary

The complete nucleotide sequence (8024 nucleotides)
of the circular double-stranded DNA of cauliflower
mosaic virus has been established. The DNA
molecule is known to possess three discrete single-stranded
discontinuities, often referred to as
"gaps," two in one strand and one in the other. The
sequence data indicate that gap 1, the single discontinuity
in the α strand, corresponds to the absence
of no more than one or two nucleotides with
respect to the complementary β strand. The two
discontinuities in the β strand, however, are not
authentic gaps since no nucleotides are missing,
but are instead regions of sequence overlap: a short
sequence (19 residues for gap 2, at least 2 residues
for gap 3) at one terminus of each discontinuity,
probably the 5' terminus, is displaced from the
double helix by an identical sequence at the other
boundary of the discontinuity. Analysis of the distribution
of nonsense codons in the DNA sequence
is consistent with other evidence that only the α
strand is transcribed. The coding region extends
around the circular molecule from 4 map units of
gap 1, the map origin, to map position 91, and
consists of six long open reading frames. Our findings
suggest, but do not prove, that the DNA sequence
of the open reading frames is colinear with
viral protein sequences. The cistron for the viral
coat protein, which is probably synthesized in the
form of a precursor, has been situated in coding
region IV on the basis of its unusual amino acid
composition.
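
Locating coding regions from the distribution of nonsense codons can be sketched as a scan for long stop-free stretches in each reading frame. The code below is only an illustration under stated assumptions, not the analysis performed for this paper: it scans a single strand as given, uses the three standard stop codons, treats the molecule as circular by reading through two concatenated copies, and the function name and `min_codons` cutoff are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard nonsense codons (DNA alphabet)


def open_reading_frames(seq, min_codons=100, circular=True):
    """Return (start, n_codons) for stop-free stretches of >= min_codons.

    start is the 0-based index of the first codon after the preceding stop.
    For a circular molecule the sequence is scanned twice over so that
    stretches crossing the origin are found; only stretches that begin
    in the first copy are reported, which avoids duplicates.
    """
    seq = seq.upper()
    n = len(seq)
    scan = seq + seq if circular else seq
    found = []
    for frame in range(3):
        # On a circle a stretch is only well defined once a stop has been seen.
        start = None if circular else frame
        for i in range(frame, len(scan) - 2, 3):
            if scan[i:i + 3] in STOP_CODONS:
                if start is not None and start < n:
                    n_codons = (i - start) // 3
                    if n_codons >= min_codons:
                        found.append((start % n, n_codons))
                start = i + 3
    return sorted(found)


if __name__ == "__main__":
    toy = "GCTGCTTAAGCTGCT"  # toy circular sequence with a single stop codon
    print(open_reading_frames(toy, min_codons=3))
```

A frame containing no stop codon at all is not reported in circular mode, and the complementary strand would have to be scanned separately by passing its reverse complement.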

Introduction 

Cauliflower mosaic virus (CaMV) is the best characterized
of the rather small number of plant viruses
containing DNA rather than RNA as genetic material
(for reviews see Hull, 1979a; Shepherd, 1979). CaMV
DNA is double-stranded (Shepherd, Bruening and
Wakeman, 1970) and has been estimated to be 7200-8000
bp long (Shepherd and Wakeman, 1971; Hull
and Shepherd, 1977; Lebeurier et al., 1978). Both
linear and circular molecules of similar contour length
may be found in CaMV DNA preparations (Shepherd
and Wakeman, 1971; Russell et al., 1971; Civerolo
and Lawson, 1978). The circular form accounts for
more than 90% of the material in fresh preparations
and is the infectious entity (Hull and Shepherd, 1977;
Volovitch, Drugeon and Yot, 1978); the linear DNA
probably arises by adventitious breakage of circular
molecules.

An unusual property of CaMV DNA is the existence
of short discontinuities ("gaps") at well defined sites
in one or the other strand of double-stranded circular
molecules (Hull and Howell, 1978; Volovitch et al.,
1978). Typically, there are two interruptions in one
strand and one in the other (Volovitch et al., 1978;
Hull, 1979b). We have chosen to designate gap 1, the
single break in the transcribed α strand, as the zero
point of our restriction map of circular CaMV DNA
(Hohn et al., 1980), as there is evidence that RNA
transcription begins near this point (Hull et al., 1979).
The two gaps in the complementary β strand, gap 3
and gap 2, are located at 20 and 53 map units,
respectively (Figure 2b). The positions of the three
gaps are conserved in all CaMV isolates examined to
date (Hull, 1979b) with the exception of CM 184,
which has undergone a small deletion in the region of
gap 3 (Hull et al., 1979).
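
Map positions such as these can be related to approximate nucleotide offsets with a short conversion. The sketch assumes that the map runs from 0 to 100 units around the full 8024-nucleotide circle with gap 1 at zero; that scaling and the function names are assumptions for illustration, and the results are only as precise as the quoted map units.

```python
GENOME_LENGTH = 8024  # nucleotides, as reported for isolate Cabb B-S


def map_units_to_offset(map_units, genome_length=GENOME_LENGTH):
    """Approximate nucleotide offset from the map origin (gap 1).

    Assumes 100 map units correspond to one full turn of the circle.
    """
    return round(map_units * genome_length / 100) % genome_length


def offset_to_map_units(offset, genome_length=GENOME_LENGTH):
    """Inverse conversion, returning map units on the assumed 0-100 scale."""
    return 100 * (offset % genome_length) / genome_length


if __name__ == "__main__":
    for name, units in (("gap 3", 20), ("gap 2", 53)):
        print(name, "is roughly", map_units_to_offset(units), "nucleotides from gap 1")
```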

With the development of sophisticated techniques 
for constructing recombinant DNA molecules there 
has been a surge of interest in the possible use of
DNA plant viruses such as CaMV as vectors for intro- 
ducing foreign genes into plants. It is evident, how- 
ever, that much will have to be learned about the 
molecular biology of these viruses and the way they 
interact with their hosts before such a plan can be 
tested. In this paper we report the complete nucleotide 
sequence of CaMV DNA (isolate Cabb B-S) and dis- 
cuss those aspects of the sequence which shed light 
upon the organization of the CaMV genome. 

Results and Discussion 
Sequence Determination 

A large number of Hinf I, Taq I, Mbo II and other
double-stranded restriction fragments of CaMV DNA
were prepared with 32P-labeled 5' termini by treatment
with polynucleotide kinase in the presence of γ-32P-ATP.
After strand separation or secondary restriction
to separate the labeled extremities, the sequence of
the first 100-150 nucleotides in from each 5' labeled
terminus was determined by the limited chemical
cleavage method of Maxam and Gilbert (1977).
Enough data were collected to establish an unambiguous
sequence for the entire genome, with over 75%
of the molecule sequenced in both strands. (Details of
this procedure are in Experimental Procedures.)

The complete sequence of CaMV DNA (isolate Cabb
B-S) is shown in Figure 1. The sequence consists of
8024 bases and numbering begins with the first dG at
the approximate 5' boundary of gap 1. Only the sequence
of the β strand, which has the same polarity
as viral mRNA (see below), is presented.

S?ATCA6AG CCATGAATCG GTTTAAGACC AAAACTCAAG AGGGTAAAAC CTCACCAAAA TACGAAAGAG TTCTTAACTC 80 
TAAAAATAAA AGATCHTCA AGATCAAACA TAGTTCCCTC ACACCGGTGA CCGACAGGAT TACCACCGTA AGGTTTCAGA 160 
ACAACATC6A AAGCGHTAC GCCAACTICG ACTCTCAACT CAASTCGTCG TACGATG6TA GATCTAAAAA GATCAAGACT 2« 
CTAAGCCTTA AAAATCTTA6 ATGTTACGAA GCCTTCCTCA GGAAGTACCT TCTGGAACAA TAAATCTCTC TGAGAATAGT ^ 320 
ACTCTATT6A GTATCCACAG GAAAAATAAC CTTCTGTaTT fiAGATGGAH TGTATCCAGA AGAAAATACC CAAAGC6AGC 400 
AATCGCAGAA TTCTGAAAAT AATATGCAAA TATTTAAATC AGAAAATTCG GATG6ATTCT CCTCCGATCT AA7GATCTCA 480 
AACGATCAAT TAAAAAATAT aCTAAAACC CAAHAACCT TGGAGAAAGA AAAGATATTT AAAATGCCTA ACGTTnATC 560 
TCAAGTTATG AAAAAAGCGT TTAGCAGGAA AAWGAGAH CTCTACTGCG TCTCGACAAA AGAATTATCA CTSGACATTC 640 
ACGATGCCAC ASGTAASGTA TATCTTCCCT TAATCACTAA GGAAGAGATA AATAAAAGAC THCCAGCTT AAAACaGAA 720 
CTCAGAAAGA CCAT6TCCAT G6TTCATCTT GSAGCfiGTCA AAATATTGCT TAAAfiaCAA TTTCGAAATG GGATTGATAC 800 
CCCAATCAAA AHGCTTTAA TCGATGATAG AATCAATTCT AGAAGAGATT GTCTTCTTGG TGCAfiCCAAA G6TAATCTA6 880 
« CATAC66TAA GTTTATGTTT ACTGTATACC CTAAGTTrCG AATAAGCCTT AACACCCAAA GACTTAACCA AACCCTAAGC 960 
CTTATTCAT6 ATTTTCAAAA TAAMATCTT ATGAATAAAfi GTCATAAA6T TATGACCATA ACCTATGTC6 TA6GATATGC 1040 . 
ATTAACTAAT AST«TCATA GCATASATTA TCAAT^AAAT GaACAATTG AACTAGAAGA CfiTATTTCAA GAAATTCGAA 1120 
AT6TCCASCA ATCTSAmC TCTACAATAC A6AATGATGA ATGCAATT6G GCCATTGATA TASCCCAAAA CAAAGCCTTA 1200 
TTASGAGCTA AAACCAAGAC TCAAATTGGT AATAACCTTC AAATAGGTAA CAGTGCTTCA TCCTCTAATA aSAAAATGA 1Z80 
ATTAGCTAGG CTAAfiCCAGA ACATAGATCT TTTAAAfiAAT AAATTAAAA6 AAATCTfiTGG AfiAATAATAT GAGCATTACG 1360 
GGACAACCQC ATGITTATAA AAAAGATAa ATTATTAGAC TAAAACCATT GTCTCTTAAT ACTAATAATA fiAAOTATGr 1440 
. TTTTA6TTCC TCAAAAfifiGA ACATTCAAAA TATAATTAAT CATCTTAACA ACCTCAATGA GATTCTAfiGA AfiAAfiCTTAC 1520 
' TCG6AATATG GMfiAICAAC TCATACTTCG GATTAAGCAA AfiACCCTTCG GAGTCCAAAT CAAAAAACCC GTCASTTTTT 16CJ0 
• AATACtGCAA AAACCATTTT TAASAGTGSyseTtGArr AaCGAGCCA ACTAAftGGAA ATAAAATCCC TmAfiAABC 1680 
TCAAAAOW:! AfiAATAAAAA GTCTAfiAAAA AfiCRATTCAA TCCTTAfiAAA ATAA6ATTGA AttAfiAatXC TTAAn A«AG 1760 
AGGAAGTTAA AfiABCTAAAA GAATCfiATTA ACTCGATCAA AEAAOGAnA AAGAATAHA TTfieCTAAAA IGGCTAATCT WO 
TAATCAGATC CAAAAAGAA6 TCICT6AAAT CCTCACTGAC CAAAAATCCA TGAAACCGCA TATAAAAGCT ATaiAGAAT 1920 
- TATTAGGATC CCAAAATCCT ATTAAAGAAA 6CTTAGAAAC CfiTTGCAGCA AAAATCGTTA ATGACTTAAC CAAGCTCATC 2000 
AATGAHGTC CHGTAACAA AGAGATATJA GAAGCCriAG GTACCCAACC TAAAGAGCAA CTAATAGAAC AACCTAAAGA 2080 
AAAAG6TAAA GBCCTTAACT TAfiGAAAATA CTCHACCCC AATTACGEAG TAGGAAATGA ACAATTAfifiA TCCTCTGGAA 2160 
ACCCTAAAfiC TTTAACCTGG CCCTTCAAA6 CTCCAGCAOS ATGGCCGAAT CAAHTTAGA CAfiAACCATT AATAG&TTTT 2240 
• " • GfiTATAATCT GQGAGAAttT TSTCTCTCAfi AAA6TCAATT CGATCTTATG ATAAGATTGA TGGAAGAGTC CCTTGACGGG 2320 
GACCAAATTA TTGATCTAAC CTCTCTACCT A6TGATAA7T TGCAfiGHGA ACAGGTTATG ACAACTACCG AAGACTCAAT 2400 
CTCGGAAGAA GAATCAfiAAT TCCTTCTAGC AATAfiSAGAA ACATCTGAAfi A^^ 2««> 
' ' ' TCGAGCAAST TCGMTGGAT CGAACAGGAG GAACGfiAGAT TCCAAAAGAA GAAGATGGTG AAGGACCATC TAGATACAAT 2560 
GAGAGAAAGA GAAAGACCCC GGASGACCGG TACTTTCCAA CTCAACCAAA GACCATTCCA GGACAAAAGC AAACGTCTAT 2640 
GGGAATGCTC AACATTGACT eCCAAACCAA TCGAAGAACT CTAATCGAC6 ACTGGGCAGC A6AAATCG6A TTGATACTCA 2720 
AGACCAATAG AGAAGACTAT CTC6ATCCAG AAACAATTCT ACTCTTGATG 6AACACAAAA CATCAGGAAT AGCCAAGGAG 2800 
TTAATCC6AA ATACAAGATG GAACCGCACT ACCGSAGACA TCATAGAACA GGTGATCGAT GCGATCTACA CCATGJTCn 2880 
, _ AGGACTAAAC TACTCCGACA ACAAACTTGC TGAGAAGAH GACGAQCAAG AGAAGGCCAA GATCAGAAT6 ACCAAGCTCC 2960 
' ' ■ AGCTCTGCGA CATCTGCTAC CTTGAGGAAT TTACATGTGA TTATGAAAAG AACAT6TATA AGACAGAACT GGCGGATTTC 3040 
* CCAr'WTATA TCAACCAGTA CCTGTCAAAA ATCCCCATCA TTGGAGAAAA AGCGHAACA CGCHTAGGC ATGAAGCTAA 3120 
CGGAACCA6C ATCTACAGTT TAG6TTTCGC GGCAAAGATA 6TCAAAGAAG AACTATCTAA AATCTGCGAC TTATCCAAGA 3200 
. AGCAGAAGAA GTTGAAGAAA TTCAACAAGA AGTGnGTAG CATCGGAGAA QCTTCAACAG AATATGGATG CAAGAAGACA 3280 
TCDVCAAAGA AGTATCACAA 6AAGCGATAC AAGAAAAAAT ATAAGGCHA CAAACCHAT AAGAAGAAAA AGAA6TTCCG 3360 
ATCAGGAAAA TACTTCAAGC CCAAAGAAAA GAAGGGCTCA AA6CAAAAGT ATTGCCCAAA AGGCAAGAAA GATTGCAGAT 3440 
• . GTTGGATCTG CAACATTGAA GGCCATTACG CCAACGAAT6 TCCTAATCGA CAAAGCTCGG AGAAGGCTCA CATCCTTCAA 3520 



CAAGCAGAAA AAT7GGGTCT CCAGCCCATT GAAGAACCCT ATGAAGGAGT TCAA6AAGTA TTCATTCTAG AATACAAAGA 

AGAMiAAGAA GAAACCTCTA CAGAAGAAAG TGATGGATCA TCTACTTCTG AAGACtCAGA CTCAGACTGA GCAGGTGATG 

AACGTCACCA ATCCCAATTC GATCTACATC AAGGGAAGAC TCTACTTCAA GGGATACAAG AA6ATAGAAC TICACTGTTT 

C6TAGACAC6 G6AGCAAGCC TATGCATAGC ATCCAA6TTC GTCATACCAG AAGAACATTG GGTCAATGCA GAAAGACCAA 

TTATGGTCAA AATAGCAGAT GGAAGCTCAA TCACCATCAG CAAAGTCTGC AAAGACATA6 ACTTGATCAT AGCCGGCGAG 

ATATTCAGAA TTCCCACC6T C7ATCAGCAA GAAA6TGGCA TCGATTTCAT TATC6GCAAC AACTTCTGTC AGCTGTATGA 

ACCAHCATA CAGTTTACGG A7AGAGTTAT CTTCACAAAG AACAAGTCTT ATCCTGTTCA TATTGCGAAG CTAACCAGAG 

CAGTQCGAGT AGGCACCGAA GGATTTCTT6 AATCAATGAA GAAACGTTCA AAAACTCAAC AACCAGAGCC AGIGAACAH 

Gap 2 

TCTACAAACA AGATAGAAAA TCCACTAGAA GAAATTGCTA TTCTTTCAGA GGGGAGGAGG TTATCAGAAG AAAAACTCTT 

7ATCACTCAA CAAAGAATGC AAAAAATCGA AGAACTACTT GAGAAAGTA7 G77CA6AAAA TCCATTA6AT CC7AACAAGA 

C7AAGCAATG 6A7GAAAGC7 7CTA7CAAGC 7CAGC6ACCC AAGCAAAGC7 ATCAAG67TA AACCCA7GAA GTATAGCCCA 

A7GGATCGCG AAGAATTT6A CAAGCAAA7C AAAGAAT7AC 7QGACC7AAA AGTCA7CAAG CCCAG7AAAA GCCC7CACAT 

QGCACCAGCC TTCTTG6ICA ACAATGAAGC CGAGAAGCGA AGAGGAAAGA AAC67ATGG7 AG7CAACTAC AAAGC7A7GA 

ACAAAGC7AC T6rAGGAGA7 GCCTACAA7C 7TCCCAACAA AGACGA6TTA C77ACAC7CA JTCGAGGAAA GAAGA7CT7C 
I 

TCTTCCTTCG AC7GTAAG7C AGGA77CTGG CAA677CTGC 7AGA7CAAGA A7CAAGACCT CTAACGGCA7 7CACATGTCC 
ACAAGG7CAC 7AC6AATGeA. A7GTG67CCC mCGGC7TA AAGCAAGC7C CATCCA7aV7 CCAAAGACAC A7GGACGAAG 
CATTTCGTGT 67TCAGAAAG TTCT67TGC6 T7TA7G7CGA CGACATTCTC G7A7TCAG7A ACAACGAAGA A6A7CATCTA 
CTTCACCTAG CAA7GATCTT ACAAAAGTGI AATCAACA76 GAATTA7CC7 7TCCAAGAAG AAAGCACAAC 7CT7CAAGAA 
GAAGATAAAC T7CCT7e67C 7AGAAATAGA 7GAAGGAACA CATAAGCC7C AAGGACA7AT CTTGGAACAC ATCAACAAG7 
7CCCCGATAC CCT7GAAGAC AAGAAGCAAC 77CAGAGA77 CT7AGGCATA C7AACA7A7G CC7CGGA7TA CA7CCCGAAG 
C7AGCTCAAA 7CAfiAAAGCC TCTCCAAGCC AAGC7TAAA6 AAAACG77CC ATGGAGATGG ACAAAAGA6G ATACCCTC7A 
■ CAT0CAAAA6 GTGAAGAAAA ATCTGCAAGG A7TTCC7CCA C7ACA7CA7C CC7TACCA6A GGAGAAGC7G ATCA7CGAGA 
ttGATGCATC AGACGAC7AC TGGGGAGG7A 7G77AAAAGC 7ATCAAAA77 AACGAAGGTA C7AA7AC7GA GT7AATT7GC 
AGATACGCAT CTGGAAGC7T TAAAGCTGCA GAAAAGAA77 ACCACAGCAA 7GACAAAGAG ACA77GGCGG 7AA7AAATAC 
7A7AAAGAAA 7T7AG7A7n A7C7AAC7CC 7GTTCATTT7 CTGA77AGGA CAGATAATAC TCA7TTCAAG A6TTTCGT7A 
A7aCAATTA CAAAGGAGAT TCGAAACHG GAAGAAACAT CAGATGGCAA QCA7GGCTTA GCCAC7AnC A7T7GATG77 
GAACACAHA AAGGAACCGA CAACCAC7TT GCGGACT7CC 7T7CAAGAfiA A77CAATAAG G77AATTCCT AA7TGAAA7C 
CGAA»TAAG ATTCCCACAC ACHGIGGCT GATATCAAAA GGCTACTGCC 7ATnAAACA CA7CTC7GGA GAC7GAGAAA 
A7CAGACC7C CAAGCATGGA GAACATAGAA AAAC7CC7CA 7GCAAGAGAA AAIACTAATG C7AGAGCTCG ATCTAG7AAG 
AGCAAAAA7A AGCHACCAA GAGC7AACG6 CTCHCGCAA CAAGGAGACC 7C7C7CTCCA CC6TGAAACA CCGGAAAAAG 
AAGAAGCAGT 7CAT7CT6CA CTGGC7ACT7 TTACGCCA7C 7CAAGTAAAA GC7A77CCAG AGCAAACGGC 7CCTGCrAAA 
GAATCAACAA A7CC6TTGAT GGC7AATATC T7GCCAAAAG A7A7GAA7TC ASTTCAGACl GAAA77AGGC CCG7AAAGCC 
ATCGGACnC HACGTCCAC ATCAGGCAA7 TCCAA7CCCA CCAAAACC76 AACC7AGCAG 7TCAGTTGCT CCTCTCAGA6 
ACGAA7CGG6 7ATTCAACAC CCTCATACCA AC7AC7ACG7 CG767A7AAC GGACC7CA76 CCGG7A7A7A CGATGAC7GG 
OGTTGTACAA AG6CAGCAAC AAACGCTG7T CCCGGAG7TG CfiCATAAGAA GmGCCAC7 AHACAGAGG CAAGAGCAGC 
AGCTGACGCG lATACAACAA GTCAGCAAAC AGATAGGHG AACTTCATCC CCAAAGGAGA AaiCAACTC AAGCCCAAGA 
GCTTTGCGAA GGCCHAACA AGCCCACCAA AGCAAAAAGC CCACTGGCTC ATGCTAGGAA CTAAAAAGCC CAGCAGTGAT 
CCAGCCCCAA AAfiAGA7C7C CT77aCCCA GAGATCACAA 7GGACGAC7T CC7CTATC7C 7ACGA7C7AG 7CAGGAAGT7 
CGACGGAGAA GGTGACGATA CCA7GnCAC CAC7GATAAT GAGAAGA7TA GCCTinCAA 7T7CA6AAAG AA7GC7AACC 
CACAGATGGT 7AGA6AGfic7 7ACGCAGCAG GTCTCATCAA GACGATCTAC CCGAGCAATA ATCTCCAGGA GATCAAA7AC 
C7TCCCAAGA AGG77AAA6A TGCAGTCAAA AGATTCAGGA C7AACTGCAT CAAGAAGACA GAGAAAGATA TATTTCTCAA 
GA7CAGAAG7 ACTAnCCAG TA7GGAC6AT TCAAGGCT7G CTTCACAAACCAAGGCAAGT AATAGAGAT7 GGAGTCTCTA 
AAAAGG7AG7 TCCCACTGAA 7CAAAGGCCA TGGAGTCAAA GAT7CAAATA GAGGACC7AA CAGAACTCGC CG7AAAGAC7 
GGCGAACAG7 TCATACAGAG 7CTC77ACGA C7CAATGACA AGAAGAAAAT C77CG7CAAC A7GG7GGAGC ACGACACGC7 
TGTCTACTCC AAAAATATCA AAGATACAGT CTCAGAAGAC CAAAGGGCAA TTMGACTTT TCAACAAAGG GTAATATCCG 7120 

6AAACCTCCT CGGATTCCAT TGCCCAGCTA TCTGTCACTT TATT6TGAAC ATA6TGGAAA AGGAAGGT6G CTCCIACAAA 7?00 

TGCCATCATT GCGAT*Aft6G AAAGGCCATC GTTGAAGATG CCTCTGCCGA CA6TGGTCCC AAAGATGGAC CCCCACCCAC 7280 

GAfifiAGCATC CTGGAAAAAS AAfiACGTTCC AACCACGTCT TCAAAGCAAG TGGATTGAI6 TGATATCTCC ACTGACGTAA 7360 

GGGATGACGC ACAA7CCCAC TATCCTTCGC AAGACCCHC CTCTATATAA GCAAGTTCAT TTCATTTGGA GAGGACACGC 7440 

TGAAATCACC AGTCTCTCTC TACAAATCTA TCTCTCTCTA TAATAATCT6 T6AGTACTTC CCAGATAAGC GAATTAGGGT 7520 " 



TCTTATAGGG TTTCGCTCAT GT6TTGAGCA TATAASAAAC CCTTAGTATC 1AT7TCIA7T TGTAAAAIAC TTCTATCAAT 7600 

AAAAIT7CTA A77CC7AAAA CCAAAATCCA G7AC7AAAAT CCAGA7C7CC TAAAG7CCC7 A7AGATC777 G7G67GAA7A 768o' 

7AAACCAGAC AC6AGACGAC 7AAACC7CCA GCCCAGACGC CG7T7GAAGC TAGAAG7ACC GCT7AGGCAG GAGGCCGTTA 7760 

GGGAAAAGAT GCTAAGGCAG GGTTGCTTAC GTTGAC7CCC CCGTAGG77T GGTTTAAArA 7CATGAAG7G GACGGAAG6A 7840 

A6GAGGAAGA CAACGAACGA TAAG6T7GCA GGCCC7C7GC AAGG7AAGAC GA7GGAAA7T T6ATAGA6G7 AC6T7ACTA7 7930 

ACT7AIAC7A 7ACGC7AAGG GAA7GCTTG7 A7nACCCTA TATACCC7AA 7GACCCC7rA TCGATT7AAA GAAA7AA7CC 8000 
Gap 1 

GCATAAGCCC CCGCTTAAAA AATT 

Figure 1. Nucleotide Sequence of Circular CaMV DNA (Isolate Cabb B-S) 

The sequence of the β strand, which has two gaps and the same polarity as viral mRNA, is presented. Numbering begins with the first dG at [...]. 
The sequence is written as everywhere complementary to the transcribed α strand in order to facilitate consideration of 
coding capacity but, in fact, the strand is broken at gaps 2 and 3 and the extremities of the sequence at each gap are redundant (see text and 
Figure 6). The asterisk marks a dG residue which may be paired with a modified dC residue in the complementary strand. 

Since the CaMV isolate used to establish the sequence has not been 
cloned, we were prepared to find some sequence variation within the 
viral DNA population. Rather surprisingly, however, not one clear-cut 
example of such heterogeneity was detected in the two CaMV DNA 
preparations used to establish the sequence. Very occasionally, double 
signals occurred at a given position in a sequence gel, but examination 
of other gels covering the region in question or its complementary 
strand generally revealed such signals to be spurious. Thus, although 
the existence of sequence variants affecting a small proportion of the 
DNA population cannot be ruled out, most of the DNA molecules of Cabb 
B-S isolate conform to the sequence presented in Figure 1. Restriction 
analysis of a large number of pBR322-CaMV DNA recombinants has revealed 
little sequence variation among the clones, a result consistent with 
this conclusion (Hohn et al., 1980). 

The Coding Region 

While it is evident that the nucleotide sequence will ultimately tell 
us much if not all about the properties of the CaMV genome, a 
straightforward extrapolation from the DNA sequence to the properties 
of the final gene products, the viral proteins, is rendered difficult 
by the complexity of mRNA maturation in eucaryotes and our lack of 
knowledge of the signals governing this process. Nevertheless, analysis 
of the coding capacity of the nucleotide sequence permits the broad 
outlines of CaMV genetic organization, if not the details, to be 
discerned. 

Figure 2a. Distribution of Nonsense Codons in the CaMV β Strand 
Sequence 

Triplet frame 1 begins with the first residue in the sequence as 
presented in Figure 1, frame 2 with the second nucleotide and frame 
3 with the third residue. The sequence in each coding frame was 
divided into consecutive segments of 12 triplet codons (36 
nucleotides). When a nonsense codon occurs in a segment, the 
corresponding rectangle in the diagram is blackened (■). 

Figure 2b. Distribution of Potential Coding Regions on the Circular 
CaMV DNA Map 

Inner circles give the positions of the six long open reading frames 
identified in (a) along with the molecular weight in kilodaltons of the 
longest possible translation product for which each could code 
(assuming that translation begins with the first in-phase ATG 
initiation codon in each open region). The outer circle gives the 
positions of Eco RI fragments A-E and the three gaps. 

Transcription of CaMV DNA is asymmetric. There is agreement that 
virus-specified RNA present in infected turnip leaves (Hull et al., 
1979; Al Ani et al., 1980) or protoplasts (Howell and Hull, 1978) 
hybridizes exclusively with the α strand of CaMV DNA, that is, the 
strand containing only one gap. It is reasonable to believe that most 
or all of these RNA transcripts encode viral proteins. Such a view is 
in fact borne out by examination of the coding capacity of the 
nucleotide sequence. Figure 2a presents an analysis of the distribution 
of TGA, TAG and TAA termination codons in the three possible coding 
frames of the β strand, the sequence having the same polarity as would 
an RNA transcript of the α strand. The diagram reveals that, apart from 
a region of about 1000 nucleotides in the vicinity of gap 1 (91-4 map 
units), virtually the entire sequence is free of nonsense codons for 
considerable distances in one or another of the three reading frames. 
The longest such potential coding sequence, region V, is 2082 
nucleotides long. Regions IV and VI are each more than 1500 residues in 
length, while region I is 1000 nucleotides, region II 500 nucleotides 
and region III 400 nucleotides long (Table 1). By way of contrast, the 
sequence complementary to the β strand, a sequence which is not 
transcribed and hence almost certainly does not code, contains no 
uninterrupted triplet reading frame of more than 370 nucleotides. It is 
noteworthy that, with the exception of about 120 nucleotides at the 
junction between regions IV and V, there is little overlap between 
successive coding regions. Thus CaMV has not economized its genetic 
information as has φX174 (Sanger et al., 1977) by using different 
reading frames of the same sequence to specify different proteins. 
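
The frame-by-frame scan summarized in Figure 2a can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure the authors used: the function names are hypothetical, the sequence is treated as linear rather than circular, and the short test string is invented for demonstration.

```python
STOP_CODONS = {"TGA", "TAG", "TAA"}  # the three nonsense codons named in the text


def stop_segments(seq, frame, codons_per_segment=12):
    """Return one flag per segment: True if the segment contains a stop codon.

    frame is 1, 2 or 3; frame 1 starts at the first residue of seq.
    The default segment size (12 codons = 36 nucleotides) follows Figure 2a.
    """
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(frame - 1, len(seq) - 2, 3)]
    flags = []
    for j in range(0, len(codons), codons_per_segment):
        segment = codons[j:j + codons_per_segment]
        flags.append(any(codon in STOP_CODONS for codon in segment))
    return flags


def longest_open_run(seq, frame):
    """Length in nucleotides of the longest stop-free stretch in one frame."""
    seq = seq.upper()
    best = run = 0
    for i in range(frame - 1, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            run = 0
        else:
            run += 3
            best = max(best, run)
    return best


if __name__ == "__main__":
    toy = "ATGAAATAGCCC"  # invented sequence, not CaMV data
    for frame in (1, 2, 3):
        print(frame, stop_segments(toy, frame, codons_per_segment=2),
              longest_open_run(toy, frame))
```

Applied to the full β strand sequence, the blackened rectangles of Figure 2a would correspond to the `True` entries returned for each frame.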

We are confident that the six long uninterrupted reading frames 
identified in Figure 2a represent the effective protein coding 
potential of CaMV DNA. The maximum lengths of the polypeptides for 
which the six regions may code (assuming translation starts with the 
first in-phase AUG in each region and that there is no read-through of 
termination codons) are shown in Figure 2b. Whether any or all of these 
polypeptides are in fact synthesized in infected tissue is uncertain 
and will probably remain so until more is known about the manner in 
which viral RNA transcripts are processed. Evidence is presented below, 
however, suggesting that a part of region IV encodes the major viral 
coat protein. 

Virus-specific RNA transcripts synthesized in protoplasts prepared 
from CaMV-infected turnip leaves have been reported to be quite large, 
with molecular weights in the range of 2 x 10^6 daltons or greater 
(Howell and Hull, 1978; Howell, Odell and Dudley, 1979). RNA molecules 
of this length could accommodate essentially all of the coding portion 
of the CaMV genome (regions I-VI). Transcripts isolated directly from 
turnip leaves, on the other hand, were found to be significantly 
smaller, sedimenting as distinct 18S and 25S species in sucrose 
gradients (Howell et al., 1979). Howell et al. (1979) have suggested 
that protoplasts are defective in viral mRNA processing and that the 
large RNA transcripts which accumulate in this system may be precursors 
of the smaller molecules. It is uncertain, however, whether this 
maturation process would involve the joining together (splicing) of 
discontinuous regions of the mRNA precursor as observed for many 
eucaryotic mRNAs. Howell et al. (1979) argue that splicing probably 
occurs, based upon their analysis of hybridization patterns in Southern 
blots between the radioactive 18S and 25S viral mRNAs and CaMV 
restriction fragments. In our laboratory, however, electron microscopic 
examination of specific RNA-DNA heteroduplexes between CaMV DNA and 
polyadenylated RNA fractions isolated from infected turnip leaves has 
so far failed to reveal a single instance of single-stranded DNA 
looping out of such hybrids, as would be expected if the DNA and RNA 
molecules are not colinear (J. Menissier, personal communication). 

Examination of the nucleotide sequence itself provides two, admittedly 
weak, arguments against extensive involvement of splicing in the mRNA 
maturation process. First, the canonical splice-junction sequences 

(5') gAGGTAAGT . . . TYTYYYTXCAGG 

(Lerner et al., 1980; Y is a pyrimidine, X is any nucleotide) or close 
variants thereof are rarely found in the CaMV β strand sequence and 
then not in places where a splice might be expected, such as across 
regions where there is a shift in the coding frame. It may be objected, 
however, that the splice junctions of plants and their viruses may be 
sufficiently different from those of animals and insects to escape 
recognition. 
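
A search for a degenerate consensus of this kind can be sketched with a regular expression. The example below is an assumption-laden illustration rather than the authors' method: it uses only the 3' junction consensus TYTYYYTXCAGG quoted above, adopts the text's notation (Y for pyrimidine, X for any nucleotide), requires exact matches rather than "close variants", and the helper names and test sequence are hypothetical.

```python
import re

# Symbol table following the notation of the text (X, not the IUPAC N, for "any").
SYMBOLS = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "[CT]",     # pyrimidine
    "X": "[ACGT]",   # any nucleotide
}


def consensus_to_regex(consensus):
    """Translate a consensus string into a regular-expression pattern string."""
    return "".join(SYMBOLS[ch] for ch in consensus.upper())


def find_sites(seq, consensus):
    """Return 1-based start positions of all matches, including overlapping ones."""
    # A lookahead makes every match zero-width, so overlapping sites are reported.
    lookahead = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [m.start() + 1 for m in lookahead.finditer(seq.upper())]


if __name__ == "__main__":
    toy = "AATCTCCCTACAGGTT"  # invented sequence containing one exact match
    print(find_sites(toy, "TYTYYYTXCAGG"))
```

Allowing mismatches, as the phrase "close variants thereof" implies, would need a scoring scheme that the text does not specify.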

A second argument rests upon the way in which the reading frame in the 
coding region jumps abruptly from one phase to another so that, in all 
but one case (the junction between regions V and VI), successive open 
regions overlap slightly or are separated from one another by only a 
few nucleotides (Table 1). Thus it is unlikely that the CaMV genome 
contains noncoding intervening sequences (introns) in the primary 
coding region (map units 4-91) which are present in primary RNA 
transcripts but eliminated from mature mRNAs. A splicing pattern of the 
type observed for adenovirus late mRNAs, however, in which a single 
nontranslated 5' leader sequence is spliced to each of several 
alternate coding sequences (Philipson, 1979), cannot be ruled out from 
the sequence data alone. 

Table 1. Coordinates of Possible Coding Regions of CaMV DNA 

Open      Start                   End                     First ATG     Protein Molecular 
Region    Nucleotide   Map Unit   Nucleotide   Map Unit   Nucleotide    Weight (Kilodaltons) (a) 

I            331         4.12       1347        16.79        364           38 
II          1328        16.55       1828        22.78       1349           18 
III         1812        22.58       2219        27.66       1830           15 
IV          2168        27.02       3670        45.74       2201           57 
V           3591        44.76       5672        70.69       3633           79 
VI          5713        71.20       7338        91.45       5776           61 

(a) Assuming polypeptide chain synthesis starts with the first in-phase AUG 
in each open region. 
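
The entries of Table 1 can be cross-checked against one another with a little arithmetic. The sketch below assumes a genome length of 8024 nucleotides (the last residue number quoted in the text) and takes the region V coordinates as 3591 and 5672, the values consistent with the tabulated map units (44.76, 70.69) and with the stated length of 2082 nucleotides; all function names are hypothetical.

```python
GENOME_LENGTH = 8024  # assumption: highest residue number quoted in the text

# region: (start, start map unit, end, end map unit, first ATG), from Table 1
TABLE_1 = {
    "I":   (331,  4.12,  1347, 16.79, 364),
    "II":  (1328, 16.55, 1828, 22.78, 1349),
    "III": (1812, 22.58, 2219, 27.66, 1830),
    "IV":  (2168, 27.02, 3670, 45.74, 2201),
    "V":   (3591, 44.76, 5672, 70.69, 3633),
    "VI":  (5713, 71.20, 7338, 91.45, 5776),
}


def map_unit(position, genome_length=GENOME_LENGTH):
    """Convert a nucleotide position to map units (percent of genome length)."""
    return 100.0 * position / genome_length


def region_length(name):
    """Length in nucleotides of an open region, both end points included."""
    start, _, end, _, _ = TABLE_1[name]
    return end - start + 1


def codons_from_first_atg(name):
    """Number of triplets from the first in-phase ATG to the end of the region."""
    _, _, end, _, first_atg = TABLE_1[name]
    length = end - first_atg + 1
    if length % 3 != 0:
        raise ValueError("first ATG is not in phase with the end of region " + name)
    return length // 3


if __name__ == "__main__":
    for name, (start, start_mu, end, end_mu, _) in TABLE_1.items():
        print(name, region_length(name), codons_from_first_atg(name),
              round(map_unit(start), 2), round(map_unit(end), 2))
```

Under these assumptions every start and end position reproduces its tabulated map unit to within about 0.01, and each first ATG lies in phase with the end of its region.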

Viral Gene Products 

Additional insight into the organization of the CaMV genome could be 
gained if we could unequivocally equate one or more of the potential 
coding regions described in the preceding section with a viral gene 
product. One obvious candidate for consideration is the major viral 
structural protein. Early investigations of the architecture of CaMV 
virions were rather confused, with estimations of the number of 
structural polypeptides present varying from two to seven or more 
(Tezuka and Taniguchi, 1972; Kelly, Cooper and Walkey, 1974; Brunt et 
al., 1975; Hull and Shepherd, 1976). The molecular weights of the 
various polypeptides ranged from 30 to 85 kilodaltons while their molar 
ratios often depended upon the method of virus purification and the age 
of the virus preparation. Al Ani, Pfeiffer and Lebeurier (1979), 
however, have recently shown that the situation is in reality much 
simpler. They argue that there is only one major CaMV structural 
protein, a polypeptide of about 42 kilodaltons that we shall refer to 
as P42. Smaller polypeptides normally associated with virion 
preparations were shown to have sequences in common with P42 and no 
doubt arise by proteolytic degradation of the major species, while the 
polypeptides in the 70-80 kilodalton range are probably artifactual 
dimers of P42 and its degradation products arising from incomplete 
reduction of disulfide bonds (Al Ani et al., 1979). 

A distinctive feature of CaMV coat protein is its unusually high 
lysine composition, which amounts to 18% (on a molar basis) of the 
amino acid content of total virion protein (Brunt et al., 1975). We 
have examined the coding capacity of each of the six putative coding 
regions identified in Figure 2 and find that only one, region IV, has 
the potential to code for a protein approaching this degree of richness 
in lysine. Figure 3a presents the amino acid sequence corresponding to 
region IV, beginning with the first ATG initiation codon in the 
appropriate phase. This hypothetical polypeptide can be seen to have an 
extremely lysine-rich region near the carboxy terminus (amino acid 
residues 333-410). Table 2 gives the amino acid composition calculated 
for a polypeptide spanning this lysine-rich core and having a molecular 
weight of about 42 kilodaltons. The exact boundaries of the putative 
coat protein polypeptide were chosen to optimize the fit, which is 
excellent, between the calculated values and those observed for viral 
coat protein (Brunt et al., 1975), but may be adjusted slightly without 
affecting the figures significantly. The fit is all the more striking 
if one takes into account the fact that the amino acid analysis was 
performed on total virion protein, which no doubt included degradation 
products of the basic 42 kilodalton polypeptide. 
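
The calculated molar percentages of Table 2 can be reproduced from the residue counts alone. The sketch below rests on one inference that the text does not state explicitly: the calculated column appears to exclude tryptophan (which was not measured in the observed analysis), since the tabulated percentages match a denominator of 352 rather than 355 residues. Names are hypothetical.

```python
# Residue counts for the putative coat protein, as listed in Table 2.
RESIDUES = {
    "Lys": 56, "His": 5, "Arg": 19, "Asp+Asn": 32, "Thr": 23, "Ser": 16,
    "Glu+Gln": 46, "Pro": 13, "Gly": 21, "Ala": 16, "Cys": 13, "Val": 5,
    "Met": 8, "Ile": 26, "Leu": 23, "Tyr": 20, "Phe": 10, "Trp": 3,
}


def molar_percent(counts, exclude=("Trp",)):
    """Molar percentage of each residue type, ignoring the excluded types.

    Excluding Trp is an inference from the tabulated values, not a stated rule.
    """
    total = sum(n for aa, n in counts.items() if aa not in exclude)
    return {aa: 100.0 * n / total
            for aa, n in counts.items() if aa not in exclude}


if __name__ == "__main__":
    percentages = molar_percent(RESIDUES)
    for aa, value in percentages.items():
        print(f"{aa:8s} {value:6.2f}")
    print("total residues:", sum(RESIDUES.values()))
```

With these counts lysine comes out near 15.9% and Glu + Gln near 13.1%, in line with the calculated column of Table 2.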

Efforts to elicit synthesis of P42 in cell-free systems primed with 
mRNA fractions from CaMV-infected leaves have not been successful. Such 
mRNA fractions do, however, direct synthesis of a virus-specific 
polypeptide with an estimated molecular weight of about 55 kilodaltons 
(P55) in a rabbit reticulocyte cell-free system (R. Al Ani and A. 
Lesot, personal communication). We consider it probable that P55 is the 
product of total translation of region IV, which could give rise to a 
polypeptide in this size range (Figure 2b). Viral coat protein, then, 
would be synthesized in the form of a precursor, thus explaining our 
failure to obtain mature coat protein in in vitro translation systems. 
Experiments are under way to discover if there is in fact a serological 
relationship between in vitro synthesized P55 and viral coat protein. 

The lysine-rich core of P42 presumably interacts 
with the DNA in the intact virion. In this regard, it is 
noteworthy that the extremities of the longer sequence 
are extremely rich in glutamic and aspartic acid resi- 
dues (Figure 3). If the precursor-product relation put 
forward above for P55 and P42 proves correct, then 
the supplementary acidic residues in the precursor 
polypeptide may serve to neutralize the lysine-rich 
core in the absence of DNA. Processing of the coat 
protein precursor would, by eliminating the acidic 
terminal sequences of the longer polypeptide, leave 
the lysine-rich core of the mature coat protein free to 
interact with DNA. 

The only other CaMV gene product that has been unambiguously 
identified is a polypeptide of about 62 kilodaltons (P62) which is the 
most prominent virus-specified translation product primed by 
polyadenylated mRNA from CaMV-infected leaves (Al Ani et al., 1980). An 
apparently similar if not identical translation product with an 
estimated molecular weight of 66 kilodaltons has been described by 
Howell et al. (1979). P62 can be detected in protein extracted from 
infected leaves, and cell fractionation experiments indicate that it is 
associated with the inclusion bodies known as viroplasts (Howell et 
al., 1979; Al Ani et al., 1980), which accumulate in the cytoplasm of 
infected cells (for references see Shepherd, 1979). Howell et al. 
(1979) have used the hybrid-arrested translation (HART) technique 
(Paterson, Roberts and Kuff, 1977) and cloned Eco RI fragments to show 
that the greatest part of the sequence encoding their 66 kilodalton 
polypeptide lies within Eco RI fragment A (map positions 76-5; Figure 
2b). Thus open region VI, which is 
mostly contained within fragment A and has the po- 
fMet Ala 
lie Arg 
Thr Thr 
Glu Pro 
Glu Arg 
Asn He 
Leu Asp 
Thr Gly 
Asp Glu 
Asn Met 
Arg Phe 
Leu Ser 
Ser Thr 
Tyr Phe 
Gly His 
Glu Glu 
Ser Thr 



Glu Ser 
Leu Met 
Thr Glu 
61 7 Phe 
Jja Arg 
Asp Cys 
Pro Glu 
Asp He 
Gin Glu 
Tyr Lvs 
Arg His 
ij£S Lvs 
Lvs Lvs 
Lvs Pro 
Tyr Ala 
Pro Tyr 
Ser Glu 



5 

He Leu 
Glu Glu 
Asp Ser 
Glu Gin 
Lvs Thr 
Gin Thr 
Thr He 
He Glu 
Lvs Ala 
Thr Glu 
Glu Ala 
Gin L^ 
Tyr His 
Lvs Glu 
Asn Glu 
Glu' Gly 
Asp Ser 



Asp Arg 
Ser Leu 
He Ser 
.Val Arg 
Pro Glu 
Asn Arg 
Leu Leu 
Gin Val 
Lys He 
Leu Ala 
Asn Gly 
Lys Leu 
Lys Lys 
Lys Lys 
Cys Pro 
Val Gin 
Asp Ser 



10 

Thr He Asn 
Asp Gly Asp 
Glu Glu Glu 
Met Asp Arg 
Asp Arg Tyr 
Arg Thr Leu 
Leu Met Glu 
He Asp Ala 
Arg Met Thr 
Asp Phe Pro 
Thr Ser Me 
Lys Lys Phe 
Arg Tyr Lys 
GJy Ser L^ 
Asn Arg Gin 
Glu Val Phe 
Asp Umber 



Arg Phe- 
Gin He 
Ser GTu 
Thr Gly 
Phe Pro 
• He Asp 
His Lys 
Met Tyr 
Lys Leu 
Gfy Tyr 
Tyr Ser 
Asn' Lys 
Lys Lys 
Gin Lys 
Ser Ser 
He Leu 



15 

Trp Tyr 
He Asp 
Phe Leu 
Gly Thr 
Thr Gin 
Asp Trp 
Thr Ser 
Thr Met 
Gin Leu 
He Asn 
Leu Gly 
Lys Cys 
Tyr Lys 
Tyr Cys 
Glu Lys " 
Glu Tyr 



Asn Leu 
Leu Jhr 
Leu Ala 
Glu He 
Pro Lj^ 
Ala Ala 
Gly He 
Phe Leu 
Cys Asp 
Gin Tyr 
Phe Ala 
Cys Ser 
Ala Tyr 
Pro 

Ala His 
Lys Glu 



20 

Gly Glu Asp 
Ser Leu Pro 
He Gly Glu 
Pro Lys Glu 
Thr He Pro 
Glu He Gly 
Ala Lys Glu 
Gly Leu Asn 
II e^ Cys Tyr 
Leu Ser Lys 
Ala Lys He 
He Gly Glu 
Lys Pro Tyr 
Gly ^ys Lys 
He Leu Gin 
Glu Glu Glu 



Cys Leu 
Ser Asp 
Thr Ser 
Glu Asp 
Gly Gin 
Leu He 
Leu He 
Tyr Ser 
Leu Glu 
He Pro 
Val Lys 
Ala Ser 
Lys Lys 
Asp Cys 
Gin Ala 
Glu Thr 



Ser Glu 
Asn Leu 
Glu Glu 
Gly Glu 
Lys Gin 
Val Lys 
Arg Asn 
Asp Asn 
Glu Phe 
He He 
Glu Glu 
Thr Glu 
Lys Lys 
Arg Cys 
Glu Lys 
Ser Thr 



25 

Ser Gin 
Gin Val 
Glu Ser 
Gly- Pro 
Thr Ser 
Thr Asn 
Thr Arg 
Lys Val 
Thr Cys 
Gly Glu 
Leu Ser 
Tyr Gly 
Lys Phe 
Ti^ He 
Leu G1y 
Glu G1u 



Phe Asp 
Glu Gin 
Asp Ser 
Ser Arg 
Met Gly 
Arg Glu 
Trp Asn 
Ala Glu 
Asp Tyr 
Lys Ala 
Lys He 
Cys Lys 
Arg Ser 
Cys Asn 
Leu Gin 
Ser Asp 



30 

Leu Met 
Val Met 
Gly Glu 
Tyr Asn 
Met Leu 
Asp Tyr 
Arg Thr 
Lys He 
Glu Lys 
Leu Thr 
Cys Asp 
Lys Thr 
Gly Lys; 
He Glu 
Pro He 
Gly Ser 



20 



[Figure 3b panel labels: VIRAL COAT PROTEIN (?); 28% GLU + ASP; 52% GLU + ASP; LYSINE CORE, 44% LYS] 

Figure 3a. Amino Acid Sequence Coded for by Open Region IV 
The sequence begins with the first in-phase ATG codon (2201-2203) and proceeds to the end of the open region. The two inverted triangles mark 
the beginning and the end of the portion of the sequence which may correspond to viral coat protein. 
Figure 3b. Distribution of Acidic and Basic Amino Acid Residues in the Open Region IV Amino Acid Sequence 

Half- and full-length vertical lines pointing upward denote aspartic acid and glutamic acid residues, respectively, while half- and full-length lines 
pointing downward denote asparagine and lysine. 



tential to code for a protein of about the right size, 
would seem to be the best candidate for the viroplast protein cistron. 
Al Ani et al. (1980) observed that hybridization of mRNA fractions with 
both Eco RI fragments A and E inhibits in vitro synthesis of P62, 
consistent with the aforesaid localization (Figure 2b). The same 
investigators also found, however, that hybridization with Eco RI 
fragment B (map units 5-30) also inhibited P62 synthesis, a result not 
easily reconciled with the sequence data unless splicing is invoked. 
Reinvestigation of this matter using cloned Eco RI fragments to 
eliminate the possibility of contamination of fragment B by fragment A 
appears essential. 

The Noncoding Region 

The sequence separating the end of coding region VI from the beginning 
of coding region I does not appear to encode protein. Reading frames 1 
and 2 in this portion of the sequence (map units 91-4) are blocked by 
numerous termination codons (Figure 2). Frame 3 contains a small open 
region of about 300 nucleotides (map units 0-4) but there is no 
in-phase ATG initiation codon. The data of Hull et al. (1979) suggest 
that transcription commences within a few map units downstream of gap 
1, the zero point in our sequence, and terminates somewhere between map 
units 76 and 100. This scheme fits in well with the sequence, as we 
would expect transcription to begin at or before map position 4.1, the 
beginning of coding region I (Table 1), and proceed to at least map 
position 91, the end of coding region VI. About 250 nucleotides 
downstream from the final triplet in region VI appears the sequence 
AATAAA (7598-7603), whose RNA equivalent is to be found 15-30 
nucleotides prior to the poly(A) tail in a great many eucaryotic mRNAs 
(Proudfoot and Brownlee, 1976). If this feature is similarly located in 
CaMV RNA transcripts, then the termination point for transcription 
would fall near map position 95. 

Table 2. Amino Acid Composition of Cauliflower Mosaic Virus Coat 
Protein 

                                   Molar % 
Amino Acid    No. Residues (a)     Calculated    Observed (b) 

Lys               56                 15.91         17.97 
His                5                  1.42          1.04 
Arg               19                  5.40          4.84 
Asp + Asn         32                  9.09          8.99 
Thr               23                  6.53          6.57 
Ser               16                  4.54          4.72 
Glu + Gln         46                 13.07         11.52 
Pro               13                  3.69          3.46 
Gly               21                  5.96          6.91 
Ala               16                  4.54          5.07 
Cys               13                  3.69          2.88 
Val                5                  1.42          1.38 
Met                8                  2.27          1.61 
Ile               26                  7.39          6.80 
Leu               23                  6.53          7.49 
Tyr               20                  5.68          5.65 
Phe               10                  2.84          3.11 
Trp                3                                (c) 

Total            355    41,417 daltons 

(a) Taken from Figure 3. 
(b) From Brunt et al., 1975. 
(c) Tryptophan content was not measured. 
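
The estimate that transcription would terminate near map position 95 follows from simple arithmetic on the AATAAA coordinates. The short sketch below assumes a genome length of 8024 nucleotides (the last residue number quoted in the text) and measures the 15-30 nucleotide interval from the last residue of AATAAA; both choices are assumptions made for illustration, and the names are hypothetical.

```python
GENOME_LENGTH = 8024   # assumption: highest residue number quoted in the text
AATAAA_END = 7603      # last residue of the AATAAA sequence (7598-7603)


def map_position(nucleotide, genome_length=GENOME_LENGTH):
    """Convert a nucleotide position to map units (percent of genome length)."""
    return 100.0 * nucleotide / genome_length


def poly_a_window(min_offset=15, max_offset=30):
    """Map-unit interval expected for the poly(A) addition site."""
    return (map_position(AATAAA_END + min_offset),
            map_position(AATAAA_END + max_offset))


if __name__ == "__main__":
    low, high = poly_a_window()
    print(f"expected poly(A) site: map units {low:.2f} to {high:.2f}")
```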

With regard to initiation of RNA transcription, many eucaryotic mRNA 
coding genes have AT-rich regions (typically, a variant of the sequence 
TATAAAA), often flanked by GC-rich sequences, 20-30 nucleotides 
upstream of the transcription initiation point (for references see 
Benoist et al., 1980). The CaMV DNA sequence between gap 1 and the 
beginning of coding region I contains several AT-rich regions but 
perhaps the most eye-catching example occurs just before gap 1: 
GCCCCCGCTTAAAAAATT (residues 8007-8024). For the present, we reserve 
judgment on the role, if any, of such sequences in transcription 
initiation until the 5' terminus of the initial RNA transcript has been 
characterized. 

The Gaps 

Determination of the sequence in the vicinity of the gaps presented 
special problems. We have observed, as have Volovitch et al. (1979), 
that restriction fragments containing a gap often display perturbed 
behavior during electrophoresis, migrating as a diffuse band or family 
of bands in the gel. We have also observed that proximity to a gap may 
render certain restriction enzyme sites refractory to attack. Hohn et 
al. (1980) have shown that CaMV DNA cloned in pBR322 (no gaps) 
possesses a Hind III restriction site within 110 nucleotides of gap 3 
that was not detected in noncloned DNA. Sequence analysis has shown 
that the Hind III recognition sequence in fact exists in noncloned 
viral DNA (positions 1513-1518), but only once were we successful in 
obtaining cleavage at this site. Several examples of incomplete 
cleavage at Taq I sites in the vicinity of gaps were also noted. 

In spite of these difficulties, a number of short 5' 32P-labeled 
double-stranded restriction fragments encompassing each of the three 
gaps were isolated. When such a fragment was subjected to strand 
separation, three radioactive single-stranded fragments of disparate 
length were produced. The two smaller fragments correspond to the two 
segments of the strand interrupted by the gap while the longest 
fragment is the continuous complementary strand. Evidently, one of the 
two shortened single-stranded fragments will have the 5' terminal 
nucleotide of the gap at its labeled extremity so that sequence data 
from this fragment should exactly situate the 5' limit of the gap with 
respect to the sequence of the complementary strand. The sequence of 
the other shortened fragment, which has the restriction site as 5' 
terminus and the gap at its 3' extremity, should in principle define 
the approximate 3' limit of the gap, providing that the restriction 
site is sufficiently close to the gap so that the sequence can be read 
to its end. 
5' Extremities of the Gaps 

Sequence gels for 5' labeled fragments originating from gaps 1, 2 and 
3 are shown in Figure 4. Hull et al. (1979) have identified the 5' 
terminal nucleotides of the gaps in CaMV DNA (isolate Cabb B-JI) as dA 
for gaps 1 and 2 and dG for gap 3. We find the same 5' termini for the 
gaps in the DNA of our isolate (Cabb B-S) both on sequencing gels 
(Figure 4) and by total P1 nuclease digestion of the 5' 32P-labeled 
fragments and electrophoresis of the digestion products at pH 3.5 (data 
not shown). In numerous experiments, however, the signal corresponding 
to the second nucleotide from the 5' labeled end of each gap was always 
obscured by a heavily labeled diffuse band or pair of bands traversing 
all four lanes of the sequencing gel (Figure 4). Anomalous signals of 
this sort were never encountered for ordinary restriction fragments 
prepared and electrophoresed in parallel, suggesting that an unusual 
structure or modified nucleotide may be present at these positions in 
the DNA molecule. The nature of these unusual residues is currently 
under investigation. 

Figure 4. Maxam-Gilbert 20% Sequence Gels of Single-Stranded 
Eco RI Fragments 5' 32P-Labeled at the Gaps 
The signal for the first nucleotide in the sequence is not readily visible 
in the reproduction. Asterisk indicates nucleotides which may be 
unusual or modified. (a) Gap 1; (b) gap 3; (c) gap 2. 

In the course of sequence determination we observed that gap 2 is 
absent from a portion of the DNA molecules of our preparations. A short 
Taq I restriction fragment spanning the region of gap 2 was found to 
migrate as two distinct approximately equimolar components in 
polyacrylamide gels. Characterization of these fragments revealed that 
the more slowly migrating of the two species contained gap 2, giving 
rise to three 5' labeled single-stranded fragments upon strand 
separation, two of which originated from the restriction cuts and the 
third from the gap, whereas the other fragment, although otherwise 
identical in sequence, consisted only of the two uninterrupted 
complementary strands. 
3' Extremities of the Gaps 

Volovitch et al. (1979) have reported that homopolymer tracts may be 
added at all three gaps in native CaMV DNA with terminal 
deoxynucleotidyl transferase, indicating that the 3' terminal 
nucleotide of each discontinuity has a free 3' OH group, but only in 
the case of gap 2 have we succeeded in introducing enough label at the 
3' terminal position for sequencing purposes. Nevertheless, a fairly 
precise localization of the 3' limits of gaps 1 and 3 was obtained by 
sequencing short 5' labeled restriction fragments having the gaps at 
their 3' termini. (An appropriate fragment terminating in gap 2 exists 
but could not be isolated in quantities sufficient for sequence 
analysis.) Starting from a Bgl II site about one hundred residues 
upstream from gap 1 on the 5' side (with respect to the discontinuous 
strand), we were able to read the sequence toward gap 1 for all but the 
last one or two residues at the 3' end of this fragment, where the 
heavily labeled band of undegraded material obscured the specific 
signals in the sequence ladder (Figure 5c). The sequence so determined 
extends to within two residues of the 5' terminal nucleotide of the gap 
(Figure 6). Thus the single-stranded region separating the 5' and 3' 
ends of the gap 1 discontinuity is no more than one or two nucleotides 
in length. 

Figure 5. Maxam-Gilbert Sequence Gels Showing the 3' Termini of 
the Gaps 

(a) Single-stranded Eco RI fragment 3' 32P-labeled at gap 2; position 
of first signal corresponds to a dinucleotide (20% gel); (b) 
single-stranded 5' 32P-labeled Hinf I fragment with 3' terminus at gap 
3 (8% gel); (c) single-stranded 5' 32P-labeled Bgl II fragment with 3' 
terminus at gap 1 (8% gel). 

Figure 5b shows a sequence ladder for a 5' labeled 
Hinf I fragment terminating at gap 3. The fragment has 
the sequence TTTTTAAGAGTGGGGGGG. . . at its 3' 
extremity (the seven dG signals in the final run are not 
readily visible in the reproduction but can be seen on 
the original film). Comparing this to the sequence of 
the complementary strand reveals that, surprisingly 
enough, the 3' terminal sequence at the discontinuity 
overlaps the first two residues of the 5' terminal con- 
tinuation of the strand (Figure 6). 

An even more sizable sequence overlap exists at 
gap 2. As mentioned above, we were successful in 
GAP 1 5' - A'G C C C CCGCTTAAAAA'ATTGGTATCAGAGCCATGAATCGG- 

-T ,CGGGGGCGAATTTTT T A , CATA6TCTCGGTACTTAGCC-5' 



5'-AGGTTATCA6AA6AAAAA 



GAP 2- 5' - T A T T C T T T C A*6 A G 6 G G A 6 6 A G G'T TATCAGAAGAAAA I^^J C - 

-ATAAGAAAG T ,C T C C C C T C C T C C. A ATA6TCTTCTTTTTGAG-5' 



Figure 6. Sequence in the Vicinity of the Gaps 
The sequence corresponding to the β strand 
is written on the upper line and the complementary 
α strand on the lower line. Asterisk 
denotes unusual or modified nucleotide (see 
text). Dashed arrows indicate the extent to 
which sequence could be read starting from 
5' labeled restriction cuts upstream of the gap. 



GftP3 



■GG* 



5'-CCATTTTTAA6AG T'GGGGGGG' TGATTACfcGAGCCAACT- 
-6GTAA A AATTCT CA££C^CCC.AACT AAT GAGC TCGGTTGA - S' 

.. 1620 1640 



fixing a 32P-AMP residue to the 3' terminus of this gap by incubating 
Eco RI fragment C with α-32P-ATP and terminal transferase. The fragment 
with the gap at its 3' terminus was purified by strand separation and 
sequenced (Figure 5a). When read in a 5' to 3' sense, the last 19 
residues of the 3' terminal sequence of this fragment were found to be 
identical to the first 19 residues of the 5' continuation of the strand 
(compare Figure 5a to Figure 4c). As there is no corresponding sequence 
duplication in the complementary strand, it follows that one or the 
other of the redundant sequences must branch off of the double helix as 
a single-stranded tail, as shown in Figure 6. We suggest that the 
displaced strand is normally that possessing the free 5' OH extremity 
because of the relative ease with which the 5' extremities of the 
discontinuities can be labeled with polynucleotide kinase (our 
unpublished observations). 

Figure 6 summarizes the sequence in the vicinity of 
the three gaps: It can be seen that, while none of the 
sequences are identical, the 5' termini of gaps 2 and 
3 both fall in regions in which the complementary 
strand is very rich in C: CCCCCCC (1634-1628) for 
gap 3 and CCTCCTCCCC (4220-4211) for gap 2. 
These are the two pyrimidine tracts having the highest 
C content in the entire molecule. Gap 1 is also close 
to a C-rich sequence, CCCCCGC (8008-8014), but
is separated from it by the symmetric AT tract
TTAAAAAATT (8015-8024) mentioned above. The
differences in sequence around the gaps presumably 
reflect differences in function. If so, the proximity of 
gap 1 to the beginning of the coding region leads 
naturally to the idea that it may be involved in initiation 
of RNA transcription, but plausible roles for gaps 2 
and 3 come less easily to mind. One interesting pos- 
sibility is that these gaps are start/stop points for 
replication of viral DNA, as it is evident that redundant
terminal sequences like those associated with gaps 2 
and 3 could arise if a round of DNA replication pro- 
ceeds for a short way beyond the original starting 
point. 



Experimental Procedures 

Details of the procedure for propagation of CaMV (isolate Cabb B-S),
purification of the virus by the Triton X-100-urea method (Hull,
Shepherd and Harvey, 1976) and extraction of the DNA have been
given elsewhere (Hohn et al., 1980). Digestion of the DNA with
restriction enzymes was performed as recommended by the supplier
(Biolabs). After restriction, the fragments were dephosphorylated by
incubation for 2 hr at 50°C with 0.1-0.5 units of E. coli alkaline
phosphatase (Boehringer) per µg DNA. The reaction was stopped by
emulsifying with phenol, phenol was eliminated by extraction with
ether and the fragments were precipitated with ethanol. The DNA
fragments were 5' end-labeled by incubation at 37°C for 30 min with
0.5 units polynucleotide kinase (Boehringer) per µg DNA in the
presence of a 2-5 fold molar excess (with respect to DNA 5' termini)
of γ-32P-ATP (3000 Ci/mmole; Amersham). The reaction was carried
out in 70 mM Tris-HCl (pH 7.5), 10 mM MgCl2, 1 mM spermidine,
0.1 mM EDTA, 5 mM dithiothreitol and 25% glycerol. If fragments in
which the 5' termini were flush or recessed were to be labeled, 25%
dimethylsulfoxide was included in the reaction mixture.

The 32P end-labeled DNA fragments were separated from one
another by electrophoresis through agarose (Hohn et al., 1980) or
polyacrylamide (Peacock and Dingman, 1967) gels. Fragments were
recovered from agarose gels by the method of Vogelstein and Gilles-
pie (1979). Elution from polyacrylamide gels was by agitation of the
crushed gel band overnight at 37°C in 500 mM NaCl, 50 mM Tris (pH
7.9). Soluble polyacrylamide was eliminated by adsorption of the
DNA to a small DEAE-cellulose column and subsequent elution at
high salt concentration. Fragments were concentrated by ethanol
precipitation with 10 µg carrier tRNA.

The 5' labeled ends of complementary strands were separated by
digestion with a second restriction enzyme or, more frequently, by
strand separation. Strand separation was carried out by heating the
DNA fragment at 90°C for 2 min in the presence of 20% dimethyl-
sulfoxide. The sample was then quick-chilled and immediately loaded
on a 5 or an 8% (depending on fragment size) polyacrylamide gel
(Szalay, Grohmann and Sinsheimer, 1977).

Singly end labeled DNA fragments were eluted from crushed 
polyacrylamide gel bands as described above except that the DEAE- 
cellulose column step was omitted. After addition of 10 µg carrier
tRNA, the fragments were ethanol-precipitated, resuspended in 60
mM sodium acetate and reprecipitated, and the precipitate was 
washed with 90% ethanol. Finally, the precipitate was dried in vacuo 
and resuspended in a small volume of distilled water for the sequenc- 
ing reactions. 

For labeling 3' ends, about 2 µg of purified CaMV DNA restriction
fragment were dissolved in 25 µl 0.1 M potassium cacodylate (pH
7.6), 1 mM CaCb, 0.2 mM dithiothreitol (Roychoudhury, Jay and Wu. 
1976), and incubated at 37°C for 10 min with 2 units of calf thymus
terminal deoxynucleotidyl transferase (PL Biochemicals) and 50 µCi
of α-32P-ATP (New England Nuclear). An additional 2 units of enzyme
were added and incubation was continued for 30 min, at which time
the mixture was supplemented with 100 nmole ATP and incubated
for 10 min more. The DNA was ethanol-precipitated with 5 µg carrier
tRNA and the precipitate, collected by centrifugation, was dissolved
in 100 µl 1 M piperidine. After heating to 90°C for 30 min, piperidine
was eliminated by three cycles of lyophilization. The lyophilized DNA,
taken up in distilled water, was subjected to strand separation as
described above.

Base-specific chemical cleavage reactions were carried out ac-
cording to the methods of Maxam and Gilbert (1977), using the
methylation reaction for G, depurination for G + A and hydrazine
reactions for C + T and C. The cleavage products were fractionated
on 8 or 20% 0.5 mm thick polyacrylamide sequencing gels operated
at high voltage (Sanger and Coulson, 1978). Autoradiography was
performed at -ZO^C with phosphotungstate intensifying screens. 
Approximately 75% of the genome was sequenced on both strands 
with special attention being devoted to those portions of the coding 
region where there is a shift in reading frame (see Results and 
Discussion). A computer program was used to search for restriction 
enzyme sites and sequence overlaps in the course of construction of 
the sequence, and another program was written to search the finished 
sequence for potential coding regions and for sequences resembling 
splice junctions (Lerner et al., 1980). Details concerning derivation of
the sequence and photographs of sequencing gels are available upon 
request. 
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Summary 

A 2.4 kb fragment containing the 5'-flanking region
and the 5'-noncoding sequence of the Vicia faba
legumin gene LeB4 mediates high level seed-specific
expression in transgenic tobacco plants. Deleted
derivatives of this legumin upstream sequence were
fused to the npt-II reporter gene to determine the
tissue-specific activity of the chimeric constructs
in stably transformed tobacco plants. The results in-
dicate the presence of positive regulatory, enhancer-
like cis elements within 566 bp of the upstream
sequence. Most importantly, however, these elements
are only fully functional in conjunction with the core
motif CATGCATG of the legumin box around position
-95, since destruction of the motif by a 6 bp deletion
in an otherwise intact 2.4 kb upstream sequence
drastically reduces expression in seeds. At the same
time, low level expression in leaves is observed. The
occurrence of similar CATGCATG consensus cis
elements with alternating purine and pyrimidine base
pairs in front of several other plant genes suggests a
functional role of the motif in a wider range of plant
promoters.
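
The two sequence properties mentioned above can be checked mechanically. The following is a minimal sketch, not part of the original work: it tests whether a motif strictly alternates between purines and pyrimidines and locates the motif in a sequence. The function names and the example sequence are hypothetical and chosen only for illustration; the reverse-complement helper is an added convenience showing that CATGCATG reads the same on both strands.

```python
# Illustrative helpers (hypothetical names) for inspecting an RY-type motif.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def base_class(base):
    """Return 'R' for a purine and 'Y' for a pyrimidine."""
    if base in PURINES:
        return "R"
    if base in PYRIMIDINES:
        return "Y"
    raise ValueError(f"unexpected base: {base!r}")


def is_alternating_ry(seq):
    """True if every neighbouring pair of bases switches purine/pyrimidine class."""
    classes = [base_class(b) for b in seq.upper()]
    return all(a != b for a, b in zip(classes, classes[1:]))


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[b] for b in reversed(seq.upper()))


def find_motif(seq, motif="CATGCATG"):
    """Return all 0-based start positions of motif in seq (overlaps allowed)."""
    seq, motif = seq.upper(), motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


if __name__ == "__main__":
    motif = "CATGCATG"
    print(is_alternating_ry(motif))            # alternating purine/pyrimidine?
    print(reverse_complement(motif) == motif)  # same on both strands?
    print(find_motif("AAACATGCATGTTT"))        # made-up example sequence
```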

Introduction 

Spatially and temporally regulated gene expression
programmes are the basis for development and mor- 
phology. The strictly seed-specific and development- 
dependent expression of seed storage protein genes 
provides a suitable experimental system to study differen- 
tial gene activation in plants. 

It is generally accepted that the seed specificity of
storage protein gene expression is primarily regulated at
the transcriptional level, although post-transcriptional
processes can modulate the final amount of translational
products widely (Goldberg et al., 1989). Current ideas
imply complex interactions of specific trans-acting
transcription factors with their cis-acting target DNA
sequences as the principal mechanism for transcription
regulation. Several DNA fragments derived from the 5'-
flanking regions of different seed protein genes have been
shown to bind defined nuclear protein factors (Allen et al.,
1989; Bustos et al., 1989; Chen et al., 1988; Jofuku et al.,
1987; Jordano et al., 1989). However, in most cases a
causal relationship connecting trans factor binding with
regulated promoter activity has not been demonstrated.
The availability of extensive sequence data from 5' flanking
regions of storage protein genes isolated from several
different species has prompted the search for conserved
sequence motifs, assuming that these elements might be
involved in trans factor binding and therefore in the regula-
tion of seed protein gene expression. Thus several sequence-
conserved, putative regulatory DNA elements have been
identified (for review see Okamuro and Goldberg, 1989);
among them the legume 12S globulin gene-specific legumin
box (Baumlein et al., 1986) with the internal, highly con-
served RY core motif CATGCATG (Dickinson et al., 1988).

Recently we have shown that about 1.2 kb of the
legumin B4 (LeB4) gene upstream sequence is sufficient
for strong seed-specific activity and that deletion derivatives
with only 193 bp and 91 bp of upstream sequence are
approximately 10 times less active (Baumlein et al., 1991a).
For a more precise localization of the cis elements which
might be responsible for this reduction in activity we have
constructed and analysed a series of new deletions.

In this paper we present data extending our knowledge 
of functionally important DNA sequences in the 5'-flanking 
region of the gene LeB4. In particular, we demonstrate that 
strong legumin promoter activity and probably also strict 
tissue specificity depend on the integrity of the short 
conserved CATGCATG sequence motif within the legumin 
box. 

Results 

Delineation of cis-acting elements by 5' deletion analysis

Earlier experimental data (Baumlein et al., 1991a) demon-
strate the presence of functionally important elements in
the legumin LeB4 gene upstream sequence between about
1200 bp (ClaI site) and 193 bp (EcoRV site) in front of the
transcription start site. For a more precise characterization
of those elements we have analysed the effect of progres-
sive and internal deletions within about 1 kb of the LeB4
upstream sequence (see Figure 1) on NPT-II reporter
enzyme levels in seeds of stably transformed tobacco
plants. As a first approximation we interpret the changes
in enzyme activity as a reflection of changes in promoter
strength.

As shown in Figure 2 the LeB4 upstream sequence can
be deleted to position -701 without an obvious loss of
NPT-II activity. The average enzyme activity seems to
drop when the sequence between -701 and -566 is
removed. However, this transition is not statistically
significant, and neither is the increase in activity between
construct -844 and -701. A significant (at the 5% level)
reduction in expression level can be detected when the
promoter is shortened to -471 bp. The 95 bp sequence
between -566 and -471 is AT-rich (73%) and includes
the motif ATTAATT which partly satisfies the ATT A/T AAT
consensus rule (Jofuku et al., 1987). The PpuMI site at
position -492 used for the construction of the two internal
deletions PC and PR (see Figure 1) is also located within
this sequence. This restriction site overlaps a so-called
GC element present in all legumin gene upstream se-
quences surveyed (Rerie, 1989; Rerie et al., 1991).
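
The AT percentages quoted for individual promoter segments are simple base counts. As a rough sketch of how such a figure could be obtained for any stretch of sequence, the helpers below compute the AT content of a whole sequence and of sliding windows along it. The function names are hypothetical, and the short inputs in the usage lines are used only as toy examples, not as the promoter segments themselves.

```python
def at_percent(sequence):
    """Percentage of A and T residues in a DNA sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    at_count = sum(1 for base in seq if base in "AT")
    return 100.0 * at_count / len(seq)


def windowed_at_percent(sequence, window):
    """AT percentage for every window of the given length.

    Returns a list of (start, percent) tuples with 0-based start positions.
    """
    if window <= 0 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    return [
        (start, at_percent(sequence[start:start + window]))
        for start in range(len(sequence) - window + 1)
    ]


if __name__ == "__main__":
    print(at_percent("ATTAATT"))              # motif quoted in the text
    print(round(at_percent("AAGGTCCCT"), 1))  # a short GC-containing example
    print(windowed_at_percent("AATTGGCC", 4))
```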

Another extremely AT-rich (82%) region was removed
to obtain construct -407. The enzyme levels produced by
this construct are on average less than 10% compared to
those produced by constructs -701 or -844. Construct
-333 lacks part of a DNA motif with a 20 out of 25 bp
homology (see Figure 1) to a promoter sequence of the
mainly seed-specifically expressed USP gene of Vicia
faba (Baumlein et al., 1991b) with no obvious effect. Another
significant (at the 1% level) reduction in the expression level
is shown by construct -232 in comparison to construct
-279. The removed sequence does not show any obvious
peculiarity apart from an 11-bp purine stretch.

The question of whether a minimal promoter completely 
lacking the conservative legumin box is still functional was 
addressed by the analysis of construct -68. Construct 
-68 leads to significantly (at the 1% level) reduced but still
measurable NPT-II activities in comparison to construct
-151. The sequence between position -151 and -68 bp
includes the total legumin box and an imperfect direct
repeat (TGTCACACACGTtcTGTCACACGT) between posi- 
tion -83 and -60 with similarity both to a motif reported 
to be present in front of several plant genes (Memelink 
et al., 1987) and to the CACA motif often found in the 
upstream regions of seed protein genes (Okamuro and 
Goldberg, 1989). The effects of even shorter promoter 
constructs on NPT-II activities in seeds have been com- 
pared in a separate experimental series. A 45-bp long 
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Figure 1. Sequence of the 5'-flanking region of LeB4 and structure of the 
LeB4 promoter deletion constructs.

(a) Sequence of the 5'-flanking region of the legumin gene LeB4 fused to
the npt-II coding region in the Ti plasmid pGV180 by a linker region. The
start points for deletion derivatives are indicated by * above the sequence
and the position number is supplemented by a restriction enzyme symbol
in case the respective site was used to create the deletion construct. The
ClaI site indicated at the top has been mapped to about 50 bp upstream
of the given sequence but not sequenced itself. Sequence motifs discussed
in the text are marked by underlining and the CATGCATG motif within
the legumin box is denoted by italics. The linker region between the last
nucleotide of the LeB4 5'-noncoding region and the first two codons of the
npt-II reporter gene is printed in lower case letters. The sequence between
positions -689 and +56 has already been published by Bäumlein et al.
(1986).

(b) Schematic structure of the LeB4 promoter deletion constructs used in
this study. The arrow at the right labelled nptII ocs 3' symbolizes the
neomycin phosphotransferase II reporter gene terminated by the poly-
adenylation region of the octopine synthase gene. The other arrows
represent LeB4 sequences upstream of the npt-II fusion point indicated in
(a) and labelled by the respective deletion end-points. Constructs denoted
PR, PC, RC and BBS were created by deletions within the total 2.4 kb LeB4
upstream region by removing the indicated restriction fragments or, in the
case of the BBS construct, by deleting 6 bp of the legumin box core motif
CATGCATG, as specified in Figure 3.
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Figure 2. NPT-W activity levels measured by 
the NPT-II gel assay in mature seeds of in-
dependent tobacco plant transformants.
The average value from all plants transformed
with the constructs indicated below the
columns is denoted by a horizontal line. The
constructs are defined in Figure 1. Statistically
significant differences at the 1% or 5% 
significance level between consecutive 
constructs are indicated by brackets. Fifty 
micrograms of protein were used in each 
assay. To keep experimental variability as 
low as possible, all values given were esti- 
mated in a single slot-blot experiment. 



promoter (construct -45) still causes low NPT-II activity 
comparable to that of construct -68. Only the removal of 
the TATA box in construct -14 completely extinguishes
the promoter activity. In addition, a cap site deletion 
(construct +20) is also inactive, as expected (data not 
shown). 

The progressively shortened promoter constructs de- 
scribed above necessarily change the spatial relationship 
between the transcription start site and the flanking 
vector-derived sequences as well as the sequences 
adjacent to the genomic integration site. To reduce the 
potential influence of spatial changes we have created and
analysed deletions within the 2.4 kb 5' flanking region (see
Figure 1). As shown in Figure 2, both the PR and PC
constructs show strongly decreased expression. The low
activity of PR confirms the data obtained with the pro-
gressive deletion constructs -471, -407, -333, -279
and -232 and demonstrates the necessity of sequence
elements between -492 (PpuMI site) and -193 (EcoRV
site) for optimal promoter function. Moreover, the reduced
expression of the PC construct indicates that additional
sequence elements at or upstream of the PpuMI site
quantitatively affect the expression of the legumin promoter.
Considering that sequences upstream of position -566 can
be deleted without a significant effect (Figure 2), we con-
clude that those additional sequence elements are localized
closely upstream of or even overlapping the PpuMI site at
-492.



The legumin box core motif CATGCATG is essential for 
seed-specific promoter activity 

Assuming that sequence conservation is an indication of
functional importance, it has been suggested that the
legumin box and its core motif CATGCATG are crucial for
legumin gene expression (Baumlein et al., 1986; Dickinson
et al., 1988). To test this hypothesis experimentally we
have used a suitable unique SphI site overlapping the
CATGCATG core element of the legumin box to specifically
remove 6 bp out of the 8 bp core motif (see Figure 3) in
the 2.4 kb LeB4 upstream sequence (BBS deletion). All of
the 10 individual transformants analysed show low NPT-
II activity in mature seeds, comparable in intensity to the
enzyme levels caused by construct -151 (Figure 2).
Surprisingly, seven out of the 10 plants transformed with
the BBS construct also showed low NPT-II activity in
leaves. Examples are given in Figure 4. In contrast, leaf
activity is not found in plants carrying constructs with at
least 700 bp proximal to the LeB4 transcription start site
(data not shown). To exclude additional unintended
changes within the mutated fragment as a cause for the
low and tissue-specifically relaxed NPT-II levels, we have
confirmed the overall integrity of all BBS constructs by
Southern hybridization. Moreover, the removal of the
former SphI site was proven by the resistance to SphI
treatment of a legumin box-containing PCR fragment
amplified from genomic DNA of BBS-transformed tobacco
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GmG 1 y TCCAT AGO CA TGCA T/ACTGAAGAATG 
PsLegA TCCATAGCC>4rGC/\4GCTGCAGAATG 
PS Leg J TCCATAGCCi4rGC/»rGCTGAACAATG 
Vf LegB TCCAtAGCC>*rGC/»rGCTGAAGAATG 
BBS TCCAT AGCC/«******CTGAAGAATG 

Gm6CG AGC C^TGO* 

CC/4rGC>\rG 
Asg1o5 JCAT-CATG 
ZmCI TCC/irGC>*rGCAC 
TGC>*rGC/4rGCAC 
ZmRABI 7 TCCACTC/\rGC/*r 

CTC>*rGC>\rGccc 

0SRAB16 TCCACC C/»rGCCG 
TsEm TGC>*rGC>»rGCAA 
Gmaux22 CATGCAT 
SV40 AAGC>irGC>*rCTC 
AAG TATGCA 

Figure 3. CATGCATG-like motifs are present in front of several plant
genes as well as in the SphI element of the SV40 enhancer.
Abbreviations: GmGly, Glycine max, glycinin gene; PsLegA, Pisum sativum
legumin A gene; PsLegJ, P. sativum, legumin J and K genes (Thompson
et al., 1991); VfLegB, Vicia faba, legumin B gene (Baumlein et al., 1986);
BBS, 6 bp deletion within the legumin box (this paper); GmβCG, G. max,
β-conglycinin gene (Harada et al., 1989); Asglo5, Avena sativa, 12S
globulin gene (Schubert et al., 1990); ZmC1, Zea mays, C1 regulator gene
of anthocyan synthesis (Paz-Ares et al., 1987); ZmRAB17, Z. mays, abscisic
acid-induced gene (McCarty, personal communication); OsRAB16, Oryza
sativa, abscisic acid-responsive gene (Mundy et al., 1990); TsEm, Triticum
aestivum, abscisic acid-induced wheat gene (McCarty, personal com-
munication); Gmaux22, G. max, auxin-regulated gene (Ainley et al., 1988);
SV40, SphI element in the simian virus 40 enhancer (Zenke et al., 1986).

plants. Thus experimental data clearly demonstrate that 
the destruction of the conservative RY motif CATGCATG 
within the 2.4 kb upstream region strongly disturbs the 
function of the legumin B4 promoter.

The AT-rich RC fragment enhances the activity of a
truncated foreign promoter

Earlier experimental data (Baumlein et al., 1991a) demon-
strate that the generally AT-rich region between positions
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Figure 4. NPT-M activity in seeds and leaves of four individual plants 
transformed with the BBS construct. 

All extracts containing 50 µg of protein were assayed on the same gel and
the autoradiogram exposed for 3 days. Note that there is no obvious 
correlation in activity between seeds and leaves in each construct. 



-1200 (ClaI site) and -193 (EcoRV site) exhibits a clear-
cut quantitative effect on the basic LeB4 promoter. The
same promoter region can also co-operate with the trun-
cated nos promoter in the Ti-plasmid-derived vector
pGV300 which contains 148 bp of upstream sequence still
including the b, a, z and reversed b element configuration
(Ebert et al., 1987). As shown in Table 1, the RC fragment
in constructs RCD+ and RCD- enhances the activity of
the truncated nos promoter in leaves more than 25-fold,
independent of its orientation. Surprisingly, in seed tissue,
the enhancing effect is only two- to fourfold; the difference
in NPT-II activity between the two orientations is not
statistically significant. In contrast to the RC fragment, a
legumin box-containing MboII fragment (positions -155
to -77 in Figure 1) in constructs LBL+ and LBL- does
not (or only weakly) interact with the nos promoter in
seeds, whereas in leaves the reverse but not the natural
orientation increases nos promoter activity about sixfold
(Table 1).
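
The fold values quoted in this paragraph follow directly from the mean activities in Table 1. As a worked check, and only as an illustrative sketch, the snippet below divides each construct's mean by the mean of the pGV300 control in the same tissue. The dictionary layout and function name are hypothetical; the numbers are the means printed in Table 1, and the row labels are used exactly as they appear there.

```python
# Mean NPT-II activity (c.p.m.) per construct, copied from Table 1.
# Row labels are kept exactly as printed in the table.
TABLE_1_MEANS = {
    "pGV300": {"seed": 120, "leaf": 343},
    "LB14": {"seed": 179, "leaf": 344},
    "LB11": {"seed": 139, "leaf": 2038},
    "RCD37": {"seed": 270, "leaf": 9790},
    "RCD2": {"seed": 500, "leaf": 9202},
}


def fold_over_control(means, control="pGV300"):
    """Ratio of each construct's mean to the control mean, per tissue."""
    base = means[control]
    return {
        name: {tissue: value / base[tissue] for tissue, value in tissues.items()}
        for name, tissues in means.items()
        if name != control
    }


if __name__ == "__main__":
    for name, ratios in fold_over_control(TABLE_1_MEANS).items():
        print(f"{name}: seed x{ratios['seed']:.2f}, leaf x{ratios['leaf']:.2f}")
```

With these means, both RCD rows come out above 25-fold in leaves and between roughly two- and fourfold in seeds, and the LB11 row is close to sixfold in leaves, in line with the statements in the text. Ratios of means say nothing about significance, which the authors assess separately.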

Discussion 

Several upstream elements quantitatively influence the 
legumin promoter activity 

The functional analysis in transgenic tobacco plants of a
series of deletions covering about 1 kb of LeB4 gene
upstream sequences further refines earlier conclusions
about LeB4 promoter regulation (Baumlein et al., 1991a).
As shown in Figure 2 there is a highly significant decrease
in activity when constructs -701 and -471 are compared.
The data suggest that the region distal of -566, but
certainly distal of -701, is of little importance for high
promoter activity in contrast to the region downstream of
bp -566. This region up to bp -407 is rich in AT base
pairs. Similar AT-rich sequences have been described as
being involved in the regulation of genes coding for seed
and other plant proteins. These sequences preferentially
interact with high mobility group (HMG) proteins which
seemingly recognize certain structural features instead of
specific primary sequences (reviewed by Weising and
Kahl, 1991). Within these AT-rich sequences lies the PpuMI
site-overlapping, evolutionarily conserved GC element
AAGGTCCCT (Rerie, 1989; Rerie et al., 1991). We take its
sequence conservation and the reduced NPT-II activity of
the PR and PC constructs (Figure 2), in which either the 5'
or the 3' part of the PpuMI site is removed, as an
indication of the functional importance of the GC element.

Another significant transition in activity, although already
at a low level, occurs when the fragment between positions
-151 and -68, containing the legumin box, is removed
(Figure 2). Whereas the BBS deletion clearly reveals the
importance of the legumin box core motif CATGCATG
(see below), the role, if any, of the additional box sequences
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Table 1. Effect of LeB4 upstream region fragments on a truncated nos 
promoter in transgenic tobacco plants

           Mean (± SEM) value of NPT-II activity (c.p.m.)
Construct  Seed           Leaf              Total number of plants
pGV300     120 ± 12       343 ± 111         9
LB14       179 ± 17*      344 ± 46          2
LB11       139 ± 11       2038 ± 743**      10
RCD37      270 ± 64**     9790 ± 3975**     4
RCD2       500 ± 195**    9202 ± 3298**     4

pGV300 is the control Ti plasmid described in Experimental procedures
containing the truncated nos promoter fused to the npt-II gene. We
fused either the legumin box-containing promoter fragment -156 to
-77 in the natural (LBL+) or the inverse orientation (LBL-), or the RC
fragment (-1200 to -193), again in either the natural (RCD+) or the
inverse orientation (RCD-) in front of the truncated promoter. Fifty
micrograms of protein extracted from leaves or mature seeds of a
total of 29 transgenic plants were analysed by the NPT-II gel assay.
Significant difference, at the *5% or **1% level, between a given
construct and the control pGV300 in either seeds or leaves as
calculated by the Mann-Whitney U test.



around the core motif remains undefined. The low but
significant NPT-II activities in seeds of plants transformed
with the LeB4 promoter constructs -232, -151 and -68
are in contrast to results reported by Shirsat et al. (1989)
and Rerie et al. (1991). These authors tested promoter
deletions of the pea legumin gene Leg1 by estimating
Leg1 protein levels in transgenic tobacco seeds and were
unable to detect any expression when upstream sequences
of only 97 bp, 124 bp and 237 bp control legumin expres-
sion. The difference between these and our results may
be explained by the lower detection sensitivity of the
immunological technique used by Shirsat et al. (1989) and
Rerie et al. (1991), although differences due to the con-
structs used (intact gene versus chimeric gene) cannot be
excluded. Presently we cannot fully explain the results of
nos promoter stimulation by LeB4 promoter fragments
(see Table 1) but we initially conclude that (i) the AT-rich
RC fragment contains sequences which meet the criteria
for enhancers (Muller et al., 1988) in stimulating the foreign
minimal nos promoter in an orientation-independent
manner, especially in leaves, and (ii) there is no element
within the RC fragment acting as a seed-specific enhancer
in the given construct.



The CATGCATG motif - a key element of the legumin gene
promoter

The sequence motif CATGCATG is conserved among
legume seed protein genes (Dickinson et al., 1988) and is
part of the 28 bp legumin box found in front of genes
coding for 12S legume seed globulins (Baumlein et al.,
1986). The exclusive deletion of 6 out of 8 bp of the
CATGCATG motif within the 2.4 kb LeB4 upstream se-
quence in front of the npt-II reporter gene leads to a
dramatic reduction of NPT-II enzyme levels (see Figure 2).
However, since similar reductions are caused by pro-
gressive deletions (-232, -151) leaving the CATGCATG
motif intact, we conclude that this motif is necessary but
not sufficient for optimal promoter function. These data
also explain why we were unable to demonstrate the
functional importance of the legumin box using progressive
deletions only (Baumlein et al., 1991a) and imply that the
legumin box core element CATGCATG can only function
properly in co-operation with additional upstream elements.

Destruction of the CATGCATG motif also causes low
NPT-II activity in leaves of BBS plants (Figure 4). Such leaf
activity has already been observed in plants carrying the
RC deletion construct (see Figure 1) as well as -193 and
-91 constructs (Wobus et al., 1989). Relaxed tissue
specificity was also reported for shortened patatin-1 pro-
moter constructs (Jefferson et al., 1990) and for a truncated
anonymous root-specific promoter (Koncz et al., 1989).
We favour the idea that the LeB4 promoter loses its tissue
specificity when the promoter is turned down by the
removal or destruction of important cis elements. However,
we have still not rigorously excluded other explanations,
such as an unknown role of the npt-II coding sequence,
as described for mammalian cells by Artelt et al. (1991).

The CATGCATG motif also occurs in other plant gene 
promoters 

Although originally described as an element specific for
legume seed protein genes, here we suggest that the
CATGCATG motif acts as a functional module in a wider
range of plant promoters. Figure 3 shows its physical
presence within the upstream regulatory sequences of
several plant genes as well as the SV40 SphI enhancer
motif. At least for the maize C1 gene it was shown that the
CATGCATG sequence is crucial for its regulation by the
viviparous gene product Vp1 (McCarty and Carson, 1991;
McCarty, personal communication). We presently favour
the idea that either a CATGCATG-binding transcription
factor or a structural peculiarity due to the alternation of
purine and pyrimidine bases, or both, are involved in the
integration of a functional transcription complex in seed
tissue.

Experimental procedures 

Plasmid constructs 

Standard cloning, construction and sequencing techniques have
been performed following the guidelines given in Ausubel et al.
(1987) and Sambrook et al. (1989). The starting point for the
generation of progressively deleted promoter fragments was the
plasmid p4/12BB, described previously (Baumlein et al., 1991a).
p4/12BB contains, beside pUC18 vector sequences, a 2.4 kb
upstream region with unique restriction sites for ClaI (around
-1200), PpuMI (-492), EcoRV (-193) and SphI (-91) plus the
complete 56 bp 5'-untranslated region of gene LeB4. The whole
fragment is flanked by an upstream EcoRI/BglII/SmaI linker
sequence and a downstream BamHI site. After cleavage at the
ClaI site, p4/12BB was partially digested with Bal 31, re-cut with
SmaI and recircularized. The deletion end-points were deter-
mined by the Sanger sequencing technique.

To create the BBS construct, plasmid p4/12BB was cut at the
SphI site overlapping the CATGCATG sequence motif, treated
with T4 DNA polymerase to resect the 3' protruding ends, and 
recircularized. The cloned products were sequenced to analyse 
the extent of the deletion. 

The PpuMI site, dividing the EcoRV/ClaI fragment (RC) into a
distal and a proximal part, was used to create the two internal
deletions, PC (removing the distal ClaI/PpuMI fragment) and PR
(removing the proximal PpuMI/EcoRV fragment).

The deleted promoter fragments were isolated as BglII/BamHI
fragments and cloned in the right orientation into the BglII site of
the intermediate vector pGV180 containing a promoterless npt-II
gene (see Baumlein et al., 1991a). Another strategy was applied to
create the three promoter constructs -45, -14 and +20. In this
case, the unique ClaI site of the plasmid pGV180/legP FL (Baumlein
et al., 1991a), containing the same LeB4 sequences as plasmid
p4/12BB described above, was used as the start point for the
partial Bal 31 digestion. Again the digestion products were cut
with SmaI to remove the upstream sequences, gel-purified, re-
circularized, transformed and the deletion end-points determined
by sequence analysis.

To test the influence of several LeB4 promoter fragments on a
truncated foreign promoter, we used the enhancer trap vector
pGV300, originally designed by Allan Caplan, Rijksuniversiteit
Gent. In this plasmid, which was derived from the pGV180 vector
(Baumlein et al., 1991a; Hemiian et al., 1986) the npt-II reporter
gene is driven by a truncated nos promoter. Using a suitable
SstII site the nos promoter was shortened to a length of 148 bp,
still including the b, a, z and reversed b sequence elements
described to be important for (albeit reduced) promoter activity
(Ebert et al., 1987). Both the EcoRV/ClaI fragment (RC) and the
legumin box-containing MboII fragment (LBL) spanning from



position -156 to -77 have been cloned in either orientation in 
front of this truncated nos promoter. 

Plant transformation 

The intermediate plasmids were transferred into the Agrobacterium
strain pGV2260 by triparental mating and used for leaf disc
transformation of Nicotiana tabacum cv. Havana as described
previously (Bäumlein et al., 1991a). The integrity of all constructs
was checked both in Agrobacterium and in the plants using
Southern hybridization and PCR techniques.

NPT-II assays

NPT-II activity was detected in 100 mg of tissue. Equal amounts
of protein, determined by the Bradford assay, were assayed for
NPT-II activity either by the gel test (Reiss et al., 1984) or the dot
technique (Platt and Yang, 1987). For quantification, the
radioactivity of cut filter spots was counted. Seed NPT-II activity was
determined from each individual transformant and the grouped
values compared by the Mann-Whitney U test. In another
experiment, equal amounts of seeds (100 mg each) of all transformants
harbouring the same construct were mixed, extracted and
analysed on a single gel. The principal results (not shown) did not
deviate from those shown in Figure 2 for individual transformants.
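
The group comparison named above can be sketched as follows. This is a minimal illustration of the Mann-Whitney U statistic computed by direct pairwise comparison; the function name and the activity values are invented for the example and are not data from this study.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples.

    U_a counts the pairs (a, b) in which a exceeds b; ties count one half.
    """
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical NPT-II activities (arbitrary units) for two constructs.
construct_1 = [12.0, 15.5, 9.8, 20.1]
construct_2 = [3.2, 4.1, 10.5]

u_1, u_2 = mann_whitney_u(construct_1, construct_2)
print(u_1, u_2)  # 11.0 1.0
```

In the classical form of the test, the smaller of the two U values is compared with the tabulated critical value for the given sample sizes.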
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Summary. We have isolated a novel gene, denoted USP, 
from Vicia faba var. minor, which corresponds to the
most abundant mRNA present in cotyledons during early
seed development; however, the corresponding protein
does not accumulate in cotyledons. The characterized
USP gene with its two introns is 1 of about 15
members of a gene family. A fragment comprising 637
bp of 5' flanking sequence and the total 5' untranslated
region was shown to be sufficient to drive the mainly
seed-specific expression of two reporter genes, coding
for neomycin phosphotransferase II and β-glucuronidase,
in transgenic Arabidopsis thaliana and Nicotiana
tabacum plants. We showed that the USP promoter
becomes active in transgenic tobacco seeds in both the
embryo and the endosperm, whereas its activity in
Arabidopsis is detectable only in the embryo. Moreover, we
demonstrated a transient activity pattern of the USP
promoter in root tips of both transgenic host species.

Key words: Arabidopsis thaliana - β-Glucuronidase -
Root tip - Seed protein gene - Vicia faba



Introduction

Despite their apparent morphological simplicity, plants
express organ-specific and developmentally regulated
genetic programs comparable in complexity to those in
animal systems (Goldberg 1988; Goldberg et al. 1989).
In an attempt to understand this complexity, the gene
families coding for seed proteins have been used as a
model experimental system in plant molecular biology
(Goldberg et al. 1989). These studies are elucidating the
basic principles of the regulation of gene expression
during embryogenesis and may provide information that
can be applied to improvement of seed protein quality.

As in most dicotyledonous plants, the seed storage
protein fraction of the fava bean Vicia faba var. minor
is dominated by its globulin components (Müntz et al.
1986). Structural and functional data are available for
the gene families coding for the major 11S legumin and
the 7S vicilin proteins (Wobus et al. 1986; Bäumlein
et al. 1986, 1987; Weschke et al. 1987; Heim et al. 1989).

We have recently described cDNA clones specific for
an Unknown Seed Protein (USP) (Bassüner et al. 1988).
The corresponding genes are transcribed into the most
abundant mRNA present during early seed development
with a time profile similar to that of vicilin mRNA.
In spite of the abundance of the mRNA, a similarly
abundant protein of the expected size (30 kDa) has not
been found (Bassüner et al. 1988). This observation
indicates that expression levels are controlled both by
transcriptional and extensive post-transcriptional processes.
As a first step in revealing the underlying regulatory
mechanisms we sequenced a USP gene and its flanking
regions. In addition, we fused the USP promoter region
to bacterial reporter genes, and describe the complex
seed and root tip-specific expression in transgenic tobacco
and Arabidopsis plants revealed by the histochemical
colour reaction for β-glucuronidase (GUS) activity.

Materials and methods 

Gene isolation and sequencing. The recombinant phage
VBO.l was originally isolated from a phage library of
the field bean (V. faba var. minor) genome (Bäumlein
et al. 1986), using a USP-specific cDNA as probe (Wobus
et al. 1986). A 3.5 kb PstI fragment containing a
member of the USP gene family was subcloned in the
phage vector M13mp18 for sequencing and in pUC18
for use in further constructions.

The M13 phage insert was sequenced in both orientations
using systematic deletions (Hong 1982) and the
chain termination method (Sanger et al. 1977). For the
processing of the sequence data a modified Pustell
computer program was used (Pustell and Kafatos 1986). The
transcription start site was determined by primer extension
(Ausubel et al. 1987) using the synthetic primer
5'CAAACTCCATTTGACTGGCT3'. A known sequence
ladder was used as size marker (Fig. 2).
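
For orientation, the primer can be converted to the sense-strand sequence it anneals to by taking its reverse complement. The short sketch below does this; the helper name is an assumption made for illustration.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


primer = "CAAACTCCATTTGACTGGCT"  # primer sequence as given in the text
target = reverse_complement(primer)
print(len(primer), target)  # 20 AGCCAGTCAAATGGAGTTTG
```

The result contains an ATG triplet, which would be consistent with a primer that spans the start of the coding region.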

Plasmid constructs and generation of transgenic plants.
For the construction of chimaeric genes consisting of
the 5' sequences (-637 to +51) of the USP gene fused
to reporter genes encoding neomycin phosphotransferase
II (NPTII) or GUS, a unique BstXI site at the ATG
start codon was converted into a BglII site by the insertion
of a BglII linker into the blunt-ended BstXI site
and the coding region together with the 3'-flanking
sequence was removed. From the resulting plasmid, the
flanking and the 5'-nontranslated regions can be obtained
as a 680 bp BamHI-BglII fragment. This fragment
was cloned in the appropriate orientation upstream of
the promoterless nptII gene in the intermediate vector
pGV180. This vector is a derivative of pGV150 conferring
hygromycin resistance (Herman et al. 1986). For
the uidA construct the BamHI-BglII fragment was
blunt-ended and cloned into the blunt-ended SacI site of
pGUS1 (Peleman et al. 1989). The resulting HindIII-BamHI
fragment containing the chimaeric USP-uidA fusion
was used to replace the HindIII-BglII fragment of
the binary vector pGA472 (An et al. 1985). The plasmids
were transferred into the Agrobacterium strain pGV2260
(Deblaere et al. 1985) by triparental mating and used
for the transformation of Nicotiana tabacum cv. Havana
by the leaf disc method (Horsch et al. 1985) and
Arabidopsis ecotype Columbia by the root transformation
method (Valvekens et al. 1988). The integrity of all
constructs was checked both in Agrobacterium and in plants
by Southern hybridization (data not shown).

Genomic blots. To determine the size of the gene family,
10 µg genomic V. faba DNA was digested with an excess
of EcoRI, BamHI, HindIII, BglII, or SphI. Lambda
DNA added to an aliquot of the reaction mixture was
used to check that digestion was complete. After blotting
on CCA paper (Hunger et al. 1986) the filter was hybridized
(2 x SSC, 65°C) with the 3.5 kb PstI fragment labelled
to a specific activity of 10^8 cpm/µg by random
priming (Feinberg and Vogelstein 1983) and exposed to
X-ray film for 2 days.

NPTII and GUS assays. NPTII activity was determined
from 100 mg tissue. Equal amounts of protein (determined
according to Bradford 1976) from each extract
were assayed by gel electrophoresis (Reiss et al. 1984).
For the analysis of the tissue specificity of the promoter,
tobacco seeds were hand dissected into embryo and
endosperm.

GUS assays were performed basically as described 
by Jefferson (1987). 

For the histochemical analysis, mature seeds were
imbibed for 4 h and embedded in a 5% agarose solution
in water without fixation. The block of agarose containing
the seed was cut in the desired orientation with a
scalpel and fixed with Pattex Super Gel (S.A. Henkel
N.V., Belgium) on the mounting table of a Vibroslicer
(Laborimpex, Belgium). After slicing, the embryo and
endosperm in the sections were separated. The sections
(approximately 30 µm) were each placed in a drop of
50 mM phosphate buffer, pH 7.0, containing 1 mM
5-bromo-4-chloro-3-indolyl-β-D-glucuronide (X-gluc),
0.1 mM potassium ferricyanide, and 0.1 mM potassium
ferrocyanide in a petri dish. Sections were incubated at
37°C for 10 min to 24 h in a humidified chamber and
then mounted on microscope slides for photography.
Photographs were taken with a Wild MPS51 microscope
(Heerbrugg, Switzerland).

Gene expression studies in intact plants were done
using sterile seedlings (grown in a 16 h light/8 h dark
cycle on K1 medium; Valvekens et al. 1988) that had
been placed directly in X-gluc solution or had been cut
at leaf, cotyledon, hypocotyl, and root prior to incubation
to avoid penetration problems. Embryos were dissected
by hand from mature seeds 4 h after imbibition
and incubated in X-gluc at 37°C.



Results 

Structure of the USP gene 

Screening of a V. faba var. minor genomic library with
a USP cDNA clone revealed two positive phages, one
of which (λUSP30.1) was chosen for restriction mapping
and sequence analysis. The nucleotide sequence of the
3.5 kb PstI fragment is shown in Fig. 1. By comparison
with cDNA sequences (Bassüner et al. 1988), two introns,
81 and 110 bp in length, were localized. The
border sequences of both introns obey the consensus
rules derived for plant genes (Brown 1986). A comparison
with cDNA clones pUSP87 and pyfcl3 (Bassüner
et al. 1988) locates the polyadenylation site 251 bp
downstream from the TAA stop codon (Fig. 1). This
3'-untranslated region contains multiple and overlapping
polyadenylation signals with plant-specific features (Joshi
1987a) 10 to 31 bp upstream from the poly(A) site.
The TGTGTTT motif often found in 3'-flanking regions
of plant genes (Joshi 1987a) immediately precedes the
poly(A) site.

The transcription start site of the USP gene was
determined by primer extension experiments (Figs. 1 and
2). As shown in Fig. 2, two bands of equal intensity
mark either an A or a C as the potential cap site. Since
67 out of 79 plant genes listed by Joshi (1987b) employ
A at the transcription initiation site, we chose the A
as position +1 of the USP gene. Conceptual translation
of the mRNA defines a polypeptide with no obvious
overall homology to any other protein sequence present
in protein databases (release 21 of the PIR and release
12 of SWISS-PROT). We note however that the signal
sequence coding region (+52 to +198; see Bassüner
et al. 1988) is interrupted by the first intron as in a
tomato proteinase inhibitor I gene (Lee et al. 1986) and
that in both genes sequences up to the intron are remarkably
homologous: 28 out of 39 nucleotides are identical
and 8 out of 13 encoded amino acid residues are
functionally equivalent (a two-codon "deletion" in the USP
gene is not counted; data not shown).
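
The counts quoted above are easier to compare as percentages. The following sketch converts them and also shows how a percentage identity would be computed for two aligned sequences; the helper names and the toy sequences are assumptions for illustration, since the actual alignment is not shown in the text.

```python
def percent(part, whole):
    """Express part/whole as a percentage."""
    return 100.0 * part / whole


def percent_identity(seq_a, seq_b):
    """Percentage of identical positions in two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return percent(matches, len(seq_a))


print(round(percent(67, 79), 1))  # 84.8  genes with A at the initiation site
print(round(percent(28, 39), 1))  # 71.8  identical nucleotides
print(round(percent(8, 13), 1))   # 61.5  functionally equivalent residues
print(percent_identity("ACGT", "ACGA"))  # 75.0 (toy sequences)
```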

[Fig. 1: the sequence listing (nucleotide positions from -632 onwards, with the deduced amino acid sequence) is illegible in the source and is omitted.]

Fig. 1. Sequence of the USP gene. The coding region is interrupted
by two introns. The vertical arrow in the protein sequence marks
the position of the presumed signal peptide cleavage point; the
vertical arrow in the 3' end of the gene indicates the polyadenylation
site, and the two vertical arrows in the 5' region mark the
transcription start sites. The CATGCATG box (A) is part of an 11 bp
purine-pyrimidine repeat represented by dots. Box B, SV40 core
enhancer motif; box C, prolamin box; box D, a sequence present
in front of several soybean seed protein genes, which is part of
an imperfect direct repeat. The putative TATA box is underlined
and is located approximately 30 bp upstream from the
transcription start

Fig. 2. Determination of the transcription start site by primer
extension. Using a known sequencing ladder as marker, the
primer extension products were found to be 61 and 62 nucleotides
long (see Materials and methods)

A search of the 5' upstream region revealed, besides
a TATA box at approximately 30 bp upstream from
the transcription initiation site, a number of conserved
sequence elements (Fig. 1): the seed protein gene-specific
CATGCATG motif, which is part of a purine-pyrimidine
repeat (Dickinson et al. 1988) at positions -170
to -177, the SV40 core enhancer motif (-154 to -160)
GTGGAAAG (Benoist and Chambon 1981), and two
copies of the so-called prolamin box (Colot et al. 1987;
Matzke et al. 1990) at positions -108 to -113 and
-409 to -414. The sequence motifs TACCACAA and
TACCACACA found as parts of an imperfect direct
repeat in the USP promoter (positions -262 to -272
and -352 to -360) closely resemble another sequence
element found upstream of several soybean seed protein
genes (Goldberg 1986). The importance of all these
structurally conserved short sequences for the tissue-specific
and development-dependent activity of the USP
promoter remains to be determined.
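
A search of this kind can be sketched as a simple scan that reports motif positions relative to the transcription start site. The example below is an illustration only: the function names are invented, the sequence is a toy string and not the USP promoter, and the coordinate convention (the base at the start site is +1, the base before it is -1) is an assumption.

```python
def to_relative(index, tss_index):
    """Convert a 0-based index to a coordinate relative to the +1 base.

    The base at tss_index is +1 and the base before it is -1 (there is no 0).
    """
    offset = index - tss_index
    return offset + 1 if offset >= 0 else offset


def find_motif(sequence, motif, tss_index):
    """Return (start, end) coordinates of every occurrence, overlaps included."""
    hits = []
    pos = sequence.find(motif)
    while pos != -1:
        hits.append((to_relative(pos, tss_index),
                     to_relative(pos + len(motif) - 1, tss_index)))
        pos = sequence.find(motif, pos + 1)
    return hits


# Toy sequence, NOT the USP promoter: 27 upstream bases followed by +1.
upstream = "GGCATGCATGACGTACTATAAATAGGC"
toy = upstream + "ACTCC"
tss = len(upstream)

print(find_motif(toy, "CATGCATG", tss))  # [(-25, -18)]
print(find_motif(toy, "CATG", tss))      # [(-25, -22), (-21, -18)]
```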



The USP gene family 

Sequence analysis of USP cDNA clones had already
proven the existence of several USP genes (Bassüner
et al. 1988). Therefore, genomic hybridization experiments
were carried out to estimate roughly the total
number of USP genes in the V. faba var. minor genome,
as shown in Fig. 3. By using different restriction enzymes
that do not cut within the sequenced USP gene we
detected approximately 15 hybridizing bands, suggesting
that the USP gene family consists of roughly 10 to 20
members. This is about the number determined for the
legumin B gene subfamily (Heim et al. 1989).

Fig. 3. Genomic blot of Vicia faba DNA digested with different
restriction enzymes and hybridized to a fragment containing the
USP gene. a, EcoRI; b, BamHI; c, HindIII; d, BglII; e, SphI.
The size markers shown on the left are in kb

Fig. 4. Comparison of NPTII activity in equal amounts of protein
from extracts of seeds (S) and leaves (L) in two independent
tobacco transformants



The USP 5'-flanking region confers seed-specific
expression on two reporter genes

We fused a BamHI-BglII fragment containing 637 bp
of the 5'-flanking and the total 5'-untranslated region
of 51 bp to a promoterless nptII gene in the intermediate
vector pGV180 (see Materials and methods) and used
it to transform N. tabacum. All ten hygromycin-resistant
tobacco plants regenerated carried one to five copies
of the chimaeric gene (data not shown). With few quantitative
differences, all plants produced high levels of
NPTII only in seeds as shown in Fig. 4. We never found
NPTII activity in leaves, even with a tenfold higher
amount of total protein in the assay. Plants transformed
with pGV180 alone were NPTII negative in both seeds
and leaves. In order to determine the distribution of
promoter activity in seed tissues, we hand-dissected
developing seeds transformed with the USP-nptII fusion
construct into embryo and endosperm. Microscopic
examination showed contamination of some of the embryos
by endosperm but this did not exceed 10%. On
the basis of equal protein concentrations we found
approximately three- to fourfold higher NPTII activity in
embryos as compared with endosperm (mean of three
experiments, data not shown). These results were confirmed
by analysing hand-prepared embryos and endosperm
of seeds transformed with a completely different
vector (pGA472) containing the uidA gene (coding for
β-glucuronidase; Jefferson 1987) fused to the USP promoter
fragment mentioned above (see Materials and
methods). The mean GUS activity in two experiments
with samples from five different plants was tenfold higher
in embryos than in endosperm. In spite of the quantitative
differences between the two experimental series
it is evident that the USP promoter is not only active
in the embryo but also in the endosperm of tobacco.

Histochemical localisation of GUS activity in seeds and
seedlings of transgenic Arabidopsis and tobacco plants

Arabidopsis thaliana ecotype Columbia and N. tabacum
cv. Havana were transformed with a USP-uidA fusion
gene (see above and Materials and methods) and
seeds and seedlings were analysed for GUS activity. Initially,
the F1 progeny of different independent transgenic
lines were analysed. Since the resulting overall staining
patterns were similar (data not shown), detailed
histochemical analysis was focused on the progeny of two
different transgenic lines from each of the two host
plants.

GUS activity in seeds. There is a substantial difference
in the ability of the GUS substrate (X-gluc) to penetrate
different tissues. Seeds especially do not take up X-gluc
easily. Therefore, it is difficult to obtain reliable results
by staining intact tissues. The problem can be partly
circumvented by using a Vibroslicer (see Materials and
methods). With this device, unfixed mature seeds can
be cut into thin sections prior to the histochemical
reaction.

Fig. 5A-J. Histochemical localization of GUS activity in embryo,
endosperm and seedlings of Arabidopsis and Nicotiana. A Hand-prepared
embryo of Arabidopsis. B Seed coat and endosperm of
Arabidopsis; section through mature seed. C Section through mature
seed of Arabidopsis. D Hand-prepared embryo of Nicotiana.
E Seed coat and endosperm of Nicotiana; section through mature
seed. F Six-day-old seedling of Arabidopsis. G Part of root system
of Arabidopsis; the root with blue root tip is itself a lateral root
(not visible on picture). H Enlargement of box in G, showing
GUS-positive root tip of Arabidopsis. I Distal part of a side root of
Nicotiana. J Root system of Nicotiana (20 days old). a, axis; c,
cotyledons; e, endosperm; gm, ground meristem; m, mucilage; pd,
protoderm; pp, palisade parenchyma; pv, provascular bundle; sc,
seed coat; sp, spongy parenchyma. Arrow, GUS-positive root tip;
double arrow, GUS-negative root tip

When sections of transformed Arabidopsis seeds were
incubated in X-gluc, all cells of the embryo as well as
the cell layer between the testa and the embryo stained
blue. This layer is the outermost cell layer of the
endosperm, the so-called aleurone layer. The rest of the
endosperm tissue is resorbed by the developing embryo as
the seed matures (Vaughan et al. 1971; Bouman 1975).

However, when embryo and endosperm were separated
after slicing, but before incubation, only embryo
cells showed GUS activity (Fig. 5A, B). The initial result
using intact tissue slices can be explained by diffusion
of the primary reaction product from cells where it is
produced to cells where it precipitates (see Jefferson
1987). However, when an oxidative catalyst is used (see
Materials and methods), precipitation of the product
appears much faster and more localized. In the presence
of the oxidative catalyst, sections with embryo and
endosperm do not stain in the endosperm. The reaction is
seen first in the palisade parenchyma cells of the
cotyledons. Subsequently, blue precipitate is localized in the
provascular bundle and protoderm of the axis. After
prolonged incubation, weaker blue stain is present in
all other cells of the embryo (Fig. 5C).

In contrast to Arabidopsis seeds, when tobacco seed 
slices were stained after separation of endosperm and 
embryo, GUS activity could be detected in both tissues, 
thus demonstrating a difference in tissue-specific expres- 
sion of the chimaeric USP-uidA gene in the two heterolo- 
gous hosts Arabidopsis and tobacco (Fig. 5D, E). 

GUS activity in seedlings. To analyse the distribution
of GUS activity during early plant development,
seedlings were taken at different times after germination and
stained in X-gluc solution.

Germinating seedlings of Arabidopsis turn completely
blue upon overnight incubation with X-gluc. Shorter
incubation periods reveal differences. Whereas the
cotyledons show patches of blue stain, the radicle is uniformly
dark blue (data not shown). In germinated seedlings
(fully expanded cotyledons), the blue precipitate can be
detected in the root with decreasing intensity towards
the root tip, whereas the root tip itself is deep blue.
Root hairs also contain the precipitate. Upon longer
incubation, blue precipitate can also be observed in the
vascular bundle, mesophyll cells and epidermal cells of
the cotyledons and the hypocotyl (Fig. 5F).

At a later developmental stage, when the first true
leaves occur, GUS activity is still present in the root,
but the staining is much weaker. It is not detected in
leaves, even when they are cut to facilitate penetration
of the substrate. In 10% of the seedlings, weak blue
staining is also detected in the vascular bundle of the
root. The root cap cells and the root meristem show
the highest GUS activity.

During the early stages of secondary root formation
(up to approximately ten roots per individual plant),
GUS activity can only be detected in the cotyledons
and in a low percentage of the root tips. We observed
that at this stage 10%-20% of the plants no longer show
root tip activity, 70%-80% of the plants show GUS
activity in only a few root tips (ranging from 10% to
90% of the root tips), and in 10%-20% of the plants,
all of the root tips turn blue (Fig. 5G, H). At later stages
in development, up to bolting, these relative numbers
shift towards plants having no root tip activity at all.
GUS activity in the root tips does not seem to be correlated
with the age of the side roots, since young side
roots do not always have this activity. Roots of mature
plants were always GUS negative. In all experiments
described, the root system was cut just below the
hypocotyl prior to incubation to avoid contamination due
to GUS activity in other parts of the plant, e.g. the
hypocotyl or cotyledons.

In N. tabacum, the pattern of GUS activity in seedlings
is very similar to that of Arabidopsis. The root
tip activity is initially very strong (Fig. 5I). When the
plant matures (two-leaf stage), activity is detected mainly
in the root cap cells but only after prolonged incubation
in X-gluc solution (30 h), suggesting a decrease in the
activity of the USP-uidA construct. As described for
Arabidopsis, this activity is not always present in all root
tips. The individual plants can have root tips that stain
blue, as well as root tips that remain uncoloured
(Fig. 5J).



Discussion 

Embryo versus endosperm activity of the USP promoter 

To analyse the tissue-specific promoter activity within
seeds we transformed both Arabidopsis and tobacco
plants with two constructs containing either the nptII
gene or the uidA gene driven by the USP promoter.
The constructs placed in different vector plasmids share
only the USP promoter fragment, thereby ruling out
possible artificial influences of the neighbouring T-DNA
sequences on the expression of the chimaeric genes.
When isolated embryo and endosperm tissues from
tobacco seeds were tested for enzyme activity, both tissues
were found to give positive results with both constructs
but with appreciably less activity in the endosperm. The
GUS analyses were further confirmed at the histochemical
level. In contrast to these results, Arabidopsis seeds
transformed with the USP-uidA construct show activity
only in the embryo but not in the endosperm cell layer.
Within the embryos of both species all cells turn completely
blue upon incubation with X-gluc if no catalyst
is used. Addition of ferro-ferricyanide as an oxidative
catalyst remarkably enhances the cell specificity of the
reaction. First, a reaction is seen in the palisade
parenchyma cells of the cotyledons and in the provascular
bundle and protoderm of the axis. After longer incubation,
weak blue staining is localized in all other cells
of the embryo. This weak blue staining could reflect
weaker expression of the USP-uidA gene in these cells,
but could also be due to diffusion of the breakdown
product of X-gluc. Similar experiments were carried out
with seeds of Arabidopsis transformed with a chimaeric
gene consisting of the regulatory sequences of at2S-1,
one of the four genes coding for the 2S napins in
Arabidopsis (Krebbers et al. 1988a), coupled to the coding
sequence of the uidA gene. In these thin sections the
reaction was first detected in the endosperm. Subsequently,
blue precipitate was observed in the spongy
parenchyma cells of the cotyledons and the ground
meristem cells of the axis (W. Boerjan, unpublished results).
These results indicate that the spatial expression pattern
observed with the USP promoter is very specific and
reflects the amount of β-glucuronidase present in the
different cell types, and is not due to artefacts such as
penetration and differences in cell size.

The difference in the expression pattern of a heterologous
gene in two plant species may reflect differences
in regulation due to the ability/inability of trans-acting
factors to recognize the cis-acting elements of the foreign
USP promoter in the respective endosperm. In any case,
data obtained from gene expression studies in heterologous
host plants should be interpreted with caution and
cannot be generalized. The situation is further complicated
by the fact that we do not know how the USP
promoter behaves in the homologous V. faba background,
because there is only a rudimentary endosperm
in legumes, which disappears at early developmental
stages. The only available data pertinent to the problem
are for soybean. In this species two seed protein mRNAs
coding for Kunitz trypsin inhibitor and β-conglycinin
cannot be detected in the endosperm before it disappears
during embryogenesis (Perez-Grau and Goldberg 1989).

This observation suggests that seed storage protein genes
of legumes become active exclusively in the embryo and,
as several studies have shown (for a review see Goldberg
et al. 1989), that this behaviour is maintained in transgenic
tobacco plants. However, this is not true for all
investigated genes, since legumin genes LegA from pea
(Croy et al. 1988) and LeB4 from fava bean (Wobus
et al. 1989; Bäumlein et al. 1990) are active in both the
tobacco embryo and endosperm. Such differences in
tissue-specific expression [illegible] different seed protein
genes are most obviously demonstrated in the case of
[illegible] soybean: one is active only
[illegible] (Barker et al. 1988), another in
both the embryo and the endosperm (S.J. Barker and
[illegible], unpublished results).

Localisation of GUS activity in seedlings
of transgenic Arabidopsis and tobacco plants

There is a substantial difference in the ability of the
GUS substrate to penetrate different tissues. Roots, for
example, take up the substrate very fast in comparison
with [illegible] cotyledons, which in fact need to be cut or
[illegible] to enable adequate penetration. Therefore it
is difficult to use histochemical techniques to obtain an
idea about the absolute levels of GUS enzyme, let alone
promoter strength, when comparing different tissues.
[Illegible] attempts to correlate
directly the amount of blue precipitate with promoter
activity is the high stability of the β-glucuronidase.
As the half-life of GUS in germinating seeds is about
[illegible] h (Bustos et al. 1989) it is difficult to determine
whether the blue colour detected in seedlings is due to
the presence of stable enzyme synthesized during seed
development [illegible]. We tend to attribute
[illegible] in cotyledons and roots (with
the exception of the activity in the root tip; see below)
of young seedlings to stable enzyme synthesized previously
during embryogenesis, but cannot exclude reactivation
of the promoter.
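
The persistence argument can be illustrated with simple first-order decay: a stable enzyme made during seed development remains detectable long after its synthesis has stopped. The sketch below is only an illustration; the function name is invented and the half-life is a placeholder, not the value reported by Bustos et al. (1989).

```python
def fraction_remaining(hours_elapsed, half_life_hours):
    """Fraction of an enzyme pool left after first-order decay."""
    if half_life_hours <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (hours_elapsed / half_life_hours)


# Placeholder half-life; not the value reported by Bustos et al. (1989).
ASSUMED_HALF_LIFE_H = 24.0

for hours in (0, 24, 48, 96):
    print(hours, fraction_remaining(hours, ASSUMED_HALF_LIFE_H))
```

With a long half-life, a sizeable fraction of the embryonic enzyme pool would still be present in young seedlings, which is why staining alone cannot separate old enzyme from new promoter activity.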



Variation in root tip activity 

In transformed Arabidopsis as well as tobacco plants,
root tip activity is strong in the early stages of development
(1- to 2-week-old seedlings) and ceases upon maturation.
Moreover, plants that already have a well-developed
root system with secondary roots do not always
show activity in all root tips. Roots showing root tip
activity are morphologically indistinguishable from
roots lacking this activity.

The fact that side roots can also have root tip activity
suggests that this activity is due to de novo synthesis
and not to a residual or redistributed activity of
β-glucuronidase synthesized during embryogenesis.

There are several possible explanations for the root 
tip activity of the USP promoter. First, this activity may 
indicate that the USP gene has a specific, but unknown 
function in root tip tissues. Second, the transient state 
of activity might well be considered as an evolutionary 
relic without detrimental effect, but also without functional
value. Third, the root tip activity of the USP promoter
might be due to the structure of the chimaeric
gene: it is possible that not all of the control elements
needed for USP gene expression reside in the 680 bp
5'-flanking region. Fourth, the transfer of the chimaeric
gene to a heterologous host, which possibly lacks silenc- 
ing activities, could also be the cause of the unexpected 
expression. This explanation is less likely, because experiments
with Arabidopsis plants transformed with a chimaeric
uidA gene consisting of the 5'-regulatory sequences
of the ats1A gene (one of the four genes coding
for the small subunit of ribulose-1,5-bisphosphate carboxylase
in Arabidopsis; Krebbers et al. 1988b), also
show root tip activity (W. Boerjan, unpublished results). 
A similar result has been obtained with transgenic Arabidopsis
plants that express a chimaeric at2S1-gus gene
(see above) (W. Boerjan, unpublished results), thus 
showing that the expression of certain chimaeric genes 
in root tips might be a common phenomenon and not 
due merely to the transfer from one species to another. 
To exclude the possibility that the root tip activity is 
due to the structure and/or the transfer of the chimaeric 
gene, in situ hybridization experiments need to be carried
out to determine whether the USP mRNA is present
in the root tips of V. faba itself.

A second question is why not all root tips of a well- 
developed root system show GUS activity. One might 
speculate that the root tips are in a developmentally 
or physiologically different state. The variation in root 
tip activity could reflect a subtle balance in the interac- 
tion between regulatory factors, which may be distorted 
in a fast-dividing tissue, for example as a result of titra- 
tion of regulatory factors. Alternatively, one might argue 
that position effects and copy number differences be- 
tween the seeds of one transformant with multiple copies 
play a role in the variation. This explanation is unlikely, 
because Fj progeny of a transformed Arabidopsis plant
containing one copy of the ats1A gene, which was made
homozygous for the T-DNA, also show variation in root 
tip activity (W. Boerjan, unpublished results). 

Currently, we are analysing progressive promoter deletions
to identify the cis elements responsible for the
seed- and root tip-specific expression of the USP gene. 
Sets of genes improved by directed evolution can be recombined in vitro
to produce further improvements in protein function. Recombination is
particularly useful when improved sequences are available; costs of generating
such sequences, however, must be weighed against the costs of
further evolution by sequential random mutagenesis. Four genes encoding
para-nitrobenzyl (pNB) esterase variants exhibiting enhanced activity
were recombined in two cycles of high-fidelity DNA shuffling and
screening. Genes encoding enzymes exhibiting further improvements in
activity were analyzed in order to elucidate evolutionary processes at the
DNA level and begin to provide an experimental basis for choosing
in vitro evolution strategies and setting key parameters for recombination.
Sequences of improved variants from the two rounds of DNA
shuffling confirmed important features of the recombination process:
rapid fixation and accumulation of beneficial mutations from multiple
parent sequences as well as removal of silent and deleterious mutations.
The five to sixfold further enhancement of total activity towards the
para-nitrophenyl (pNP) ester of loracarbef was obtained through recombination
of mutations from several parent sequences as well as new point
mutations. Computer simulations of recombination and screening illustrate
the trade-offs between recombining fewer parent sequences (in
order to reduce screening requirements) and lowering the potential for
further evolution. Search strategies which may substantially reduce
screening requirements in certain situations are described.

© 1997 Academic Press Limited 

Keywords: directed evolution; DNA shuffling; random mutagenesis; para-
Introduction

Enzymes can be evolved in vitro to exhibit new 
and useful functions. A sampling of the local 
sequence space of the enzyme is created by muta- 
genesis; screening or selection directs the evolution 
towards the desired features. A successful strategy
for improving enzyme activity in non-natural
environments (Chen & Arnold, 1993) and on non-natural
substrates (Moore & Arnold, 1996) has
been to accumulate amino acid substitutions over
multiple generations of random mutagenesis and 
screening. In practice, the best variant identified in
each generation is chosen to parent the subsequent 
generation. Other potentially useful variants are set 
aside, and their mutations must be rediscovered in 
the evolved protein background in order to 
become incorporated. Because there is no mechan- 
ism other than back mutation for deleting 
mutations, this approach can also accumulate dele- 
terious mutations, leading to premature termin- 
ation of an evolving lineage. These are the classical 
arguments for the benefits of recombination (sex) 
in evolution (Maynard Smith, 1988). Recombination
allows more rapid accumulation of beneficial
mutations present in a population. It also makes 
possible the removal of deleterious mutations 
which would otherwise accumulate in an asexual 
population, a phenomenon known to geneticists as 
Muller's ratchet (Muller, 1932). Recombination can
provide similar benefits for in vitro molecular evolution
(Stemmer, 1994a,b).

Bacillus subtilis p-nitrobenzyl (pNB) esterase catalyzes
the hydrolysis of the para-nitrobenzyl esters
of various cephalosporin-type antibiotics, a necessary
step in their large-scale synthesis (Zock et al.,
1994). Using four generations of sequential random
mutagenesis and screening, we evolved a series of 
pNB esterases up to 30 times more active towards 
hydrolysis of the pNB ester of loracarbef (LCN- 
pNB) in aqueous dimethylformamide (Moore & 
Arnold, 1996). During the fourth generation, a 
large number (~7500) of pNB esterase clones were 
screened and partially characterized in order to
validate the rapid screening assay. Sixteen 
improved pNB esterase clones were identified, 
from which the five most active enzymes (>50% 
enhancements in activity over the parent enzyme)
were characterized. DNA sequencing revealed four 
unique pNB esterases (Table 1). Due to the limi- 
tations of screening, evolved sequences are gener- 
ated using a low rate of point mutagenesis and 
typically accumulate a single beneficial mutation 
per generation. A simple restriction/ligation experiment
demonstrated that recombination of
mutations present in at least two of those
sequences could further improve pNB esterase
activity. Recombining gene segments from two
improved pNB esterase variants yielded an
enzyme twice as active as the best parent. DNA 
sequencing demonstrated that mutations from each 
of the two parents were combined in the new 
sequence (I60V and L334S), while one neutral or 
slightly deleterious mutation was deleted (K267R; 
Moore & Arnold, 1996). 

Stemmer recently introduced the technique of 
"DNA shuffling" to create novel genes by recombi- 
nation of closely-related DNA sequences (Stemmer, 
1994b). Because it also introduces new point 
mutations during reassembly of the DNA frag- 
ments, DNA shuffling alone has been effective for 
directed protein evolution starting from a single 
sequence (Stemmer, 1994a; Crameri et al., 1996). 
Questions arise as to how this approach is best 
implemented and integrated with other in vitro 
evolution approaches such as sequential random
mutagenesis. Issues include optimizing the point 
mutagenesis rate associated with DNA shuffling, 
determining appropriate screening sample sizes 
and how many parental genes to recombine, and 
deciding when to use recombination. Here we
investigate the further evolution of pNB esterase 
by DNA shuffling of the improved sequences gen- 
erated by random mutagenesis and screening. By 
following how the genes evolve during cycles of 
DNA shuffling and screening, we can elucidate the 
mechanisms contributing to the evolution of func- 
tion and begin to optimize strategies for in vitro 
evolution. An analysis of the recombination pro- 
cess identifies some of its benefits and limitations 
for directed evolution and allows a rational choice 
of mutagenesis and screening strategies. 



Results and Discussion 

Recombination statistics and 
screening requirements 

To comment on the utility of DNA shuffling in 
directed evolution, a review of the statistics of 
recombination of multiple parent sequences is use- 
ful. For this discussion, we will assume that the 
mutations are unique and distributed far enough
from one another on the genes that recombination 
occurs freely between any two. Furthermore, equal 
amounts of the initial DNA sequences are recom- 
bined. Consider the random recombination of 
three parent sequences, each of which contains a 
single mutation. Any given mutation will be incor- 
porated into a progeny sequence with a probability 
of 1/3; the probability of generating the wild-type 
sequence is 2/3 at each mutation site. This high- 
lights an important consequence of shuffling mul- 
tiple sequences: there is a statistical preference for 
the absence of mutation in the progeny. The overall
probability of picking a completely wild-type
sequence from the recombined library is $(2/3)^3 = 0.30$.
The probability of generating a sequence
containing a single mutation (a parent sequence) is
$1/3 \times (2/3)^2 = 0.15$. Because there are $C_1^3 = 3!/(1!\,2!)$,
or three, such sequences, the overall fraction of
parent sequences in the library is 0.45. Thus fully
75% of the sequences in the recombined library are 
variants already in the evolutionist's possession. 
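
The three-parent arithmetic above can be checked numerically. What follows is a minimal Python sketch, not code from the original study: the function names are hypothetical, and it assumes, as the text does, free recombination between sites, equal amounts of each parent, and one unique mutation per parent. The closed form is compared against a brute-force enumeration of every possible progeny.

```python
from itertools import product
from math import comb


def mutation_count_distribution(n_parents: int, n_mutations: int) -> list[float]:
    """P(mu) for mu = 0..M, each site being mutant with probability 1/N."""
    p = 1.0 / n_parents
    return [
        comb(n_mutations, mu) * p**mu * (1.0 - p) ** (n_mutations - mu)
        for mu in range(n_mutations + 1)
    ]


def brute_force_distribution(n_parents: int) -> list[float]:
    """Enumerate all progeny of N single-mutation parents (so M = N).

    Site i carries its mutation only when it is copied from parent i.
    """
    counts = [0] * (n_parents + 1)
    for donors in product(range(n_parents), repeat=n_parents):
        mu = sum(1 for site, donor in enumerate(donors) if site == donor)
        counts[mu] += 1
    total = n_parents**n_parents
    return [c / total for c in counts]


dist = mutation_count_distribution(3, 3)
already_known = dist[0] + dist[1]  # wild type plus the three parents
print([round(x, 3) for x in dist], round(already_known, 3))
```

With three parents the wild-type and single-mutation classes together account for roughly three quarters of the library, which is the figure quoted in the text.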

In general, for a recombination system consisting 
of N sequences and M total mutations, the prob- 
ability of generating progeny sequences containing 
μ mutations equals the number of ways a
μ-mutation sequence can be generated ($C_\mu^M$) multiplied
by the probability of generating any single
μ-mutation sequence:

$$P_\mu = C_\mu^M \left(\frac{1}{N}\right)^{\mu} \left(\frac{N-1}{N}\right)^{M-\mu}$$

Figure 1 summarizes the analysis for recombina- 
tion of single-mutation parent sequences (N = M). 
The probability that recombination will return the 
zero-mutation "grandparent" or single-mutation 
parent sequences remains constant between 73 
and 75%; only ~25% of the clones screened have
sequences that have not already been examined. 
The probability of creating individual sequences 
declines dramatically with increasing numbers of 
parents. The least frequent sequences are those 
containing the majority of mutations from the 
parent population, and the sequence containing 
all the mutations (μ = M) is of course the rarest.
The probability of generating the rarest
sequence is $P_M = (1/N)^M$.

Because we are interested in the evolution of 
function, we need consider only those mutations 
responsible for functional differences among pro- 
Figure 1. Probabilities of generat-
ing sequences containing different 
numbers of mutations by random 
recombination, based on re- 
combining single-mutation parent 
sequences. Novel variants (not 
grandparent or parent sequences) 
are shown with unfilled symbols. 



tein variants. Neutral mutations by definition do
not affect function; their distribution among pro- 
geny sequences is determined statistically, even in 
the screened population (Zhao & Arnold, 1997b). 
Thus for the purposes of this discussion of recom- 
bination libraries and screening requirements, M is 
the number of mutations that affect the targeted
function (either beneficial or deleterious).† By
screening enough clones to ensure that the rarest 
sequence, that is, containing all M mutations, has 
been examined, one can be sure that the best var- 
iant will be discovered. This is true even if the best 
variant does not contain all the functional 
mutations (as would be expected if some
mutations were deleterious or if the effects of 
mutations are not cumulative). 

In practice, of course, oversampling is required
to ensure that a particular variant has been examined
during the course of screening. To be 95%
confident that the most active combination variant
has been examined, we must be 95% confident the
rarest variant has been examined. If S is the num- 
ber of clones sampled, then 

$(1 - P_M)^S < 1 - \text{confidence limit}$

describes how the probability of not sampling the 
rarest variant changes with increasing S. This 
allows calculation of the number of samples 
required for a given confidence limit. The oversampling
is then how many more samples must be
screened over the theoretical minimum. When one 
clone is required with 95% confidence, the over- 
sampling will be between 2.6 and 3.0 (for larger 
numbers of parents). Even a relatively low rate of 
background point mutagenesis, however, can 
introduce significant confounding effects. Non-neu- 
tral point mutations obscure recombination events 



† A mutation that is neutral in one context (i.e. in the
wild-type background), but becomes functional in a
different context, would be considered a functional
mutation.



and increase the amount of screening required to 
find the best sequences (vide infra). Thus, in practice,
it may be impossible to screen sufficient num-
bers of clones to be sure of finding the best 
recombinant, particularly when the point mutation 
rate is high and a large number of functional 
mutations are being recombined. Alternative strat- 
egies which can reduce screening requirements 
under special conditions will be discussed further 
on. 
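
The sample-size condition above can be turned into a small calculation. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes no background point mutagenesis and free recombination, so that the rarest variant (all M mutations) occurs with probability (1/N)^M.

```python
import math


def rarest_probability(n_parents: int, n_mutations: int) -> float:
    """Probability of drawing the progeny that carries all M mutations."""
    return (1.0 / n_parents) ** n_mutations


def clones_required(n_parents: int, n_mutations: int, confidence: float = 0.95) -> float:
    """Solve (1 - P_M)**S = 1 - confidence for S (not rounded)."""
    p_rarest = rarest_probability(n_parents, n_mutations)
    return math.log(1.0 - confidence) / math.log(1.0 - p_rarest)


def oversampling(n_parents: int, n_mutations: int, confidence: float = 0.95) -> float:
    """Clones required relative to the theoretical minimum of 1 / P_M."""
    return clones_required(n_parents, n_mutations, confidence) * rarest_probability(
        n_parents, n_mutations
    )


for n in (2, 3, 4, 5):
    print(n, math.ceil(clones_required(n, n)), round(oversampling(n, n), 2))
```

For single-mutation parents (N = M) the oversampling factor rises from about 2.6 for two parents towards 3.0 as more parents are added, in line with the range given in the text.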

DNA shuffling of evolved pNB esterases 

An effect of forcing DNA polymerase to syn- 
thesize full length genes from the pool of small 
DNA fragments generated during DNA shuffling 
is additional background point mutagenesis. A high
rate of point mutagenesis can severely inhibit 
the discovery of novel combinations of existing 
mutations within a population. Because most 
mutations are deleterious (in a screening assay sen- 
sitive to small changes in the screening variable), 
beneficial recombinations and rare beneficial point 
mutations are masked by the negative background. 
DNA shuffling with a 0.7% mutagenesis rate, for
example, would yield an average of 10-11 point 
mutations in the 1470 bp pNB esterase gene. This 
is substantially more than the optimal mutation 
frequency (~three mutations per gene) for directed 
evolution of pNB esterase (Moore & Arnold, 1996). 
In fact, when the four evolved pNB esterase gene 
sequences were shuffled using Taq polymerase,
fully 90% of the clones in the resulting library 
exhibited essentially no esterase activity during 
screening (data not shown). In a parallel study, we 
observed that 80% of the clones generated by DNA 
shuffling of subtilisin E exhibited no activity (Zhao 
& Arnold, 1997a). 

In an effort to reduce the background mutagenesis
rate, a proofreading polymerase (Pwo) was
used during fragment reassembly. With Pwo, 50 to 
100 base-pair fragments could be reassembled to 
create a library in which fully 80% of the clones 
Figure 2. Activity profiles of generations 5 and 6 determined
by screening libraries created by DNA shuffling
of unique fourth and fifth generation variants. Activities
were sorted from best to worst. Profiles are normalized
by the number of clones screened.



retained activity. Inserts from 13 randomly picked 
colonies were partially sequenced in order to deter- 
mine the point mutation rate. Five mutations not 
present in any of the parent sequences were found 
in 12,000 nucleotides sequenced, for an overall 
mutagenic rate of ~0.04%. These minimally mutagenic
conditions were used for DNA shuffling.
A subsequent, in-depth investigation of the various 
steps involved in DNA shuffling has allowed us to 
identify a set of recombination protocols with a 
wide range of point mutagenesis rates (Zhao & 
Arnold, 1997a). 
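
The mutation-rate figures quoted in this section follow from simple arithmetic, sketched below in Python. This is an illustration rather than part of the original analysis: the function names are hypothetical, and the Poisson estimate of the fraction of clones carrying no new mutation is an added assumption that does not appear in the source.

```python
import math

GENE_LENGTH_BP = 1470  # length of the pNB esterase gene given in the text


def expected_new_mutations(rate_per_base: float, length_bp: int = GENE_LENGTH_BP) -> float:
    """Mean number of new point mutations per gene copy."""
    return rate_per_base * length_bp


def observed_rate(new_mutations: int, bases_sequenced: int) -> float:
    """Point mutation rate estimated from sequencing data."""
    return new_mutations / bases_sequenced


def fraction_without_new_mutations(rate_per_base: float, length_bp: int = GENE_LENGTH_BP) -> float:
    """Assumption: mutations per gene are Poisson distributed."""
    return math.exp(-rate_per_base * length_bp)


high_rate_example = expected_new_mutations(0.007)   # the 0.7% example
proofreading_rate = observed_rate(5, 12_000)        # 5 new mutations in 12,000 nt
low_rate_example = expected_new_mutations(0.0004)   # the ~0.04% rate
print(round(high_rate_example, 2), round(proofreading_rate * 100, 3), round(low_rate_example, 3))
```

Under the Poisson assumption, a little over half of the clones produced at the ~0.04% rate would carry no new point mutation at all, which is at least compatible with the large fraction of active clones reported for the proofreading conditions.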

Four unique fourth generation improved pNB 
esterase variants were chosen as the starting point 
for further directed evolution by DNA shuffling. 
Two cycles of DNA shuffling and screening for 
activity towards the p-nitrophenyl ester of loracar- 
bef (pNP-LCN) in 25% dimethylformamide (DMF) 
were performed. The activity profiles of the result- 
ing populations (generations 5 and 6) are shown 
in Figure 2. To generate these profiles, activities 
of the individual clones measured in the 96-well
plate screening assay were normalized by cell 
density (A^) and plotted in descending order. 
Approximately 2% of the 948 generation 5 clones 
screened exhibit more total activity than the most 
active parent (4-54B9). The screened population 
was sufficiently large to give a high level of con- 
fidence that the most active variant that can be 



† When shuffling four parent sequences each of which
contains one beneficial mutation, 765 clones must be
screened to be 95% confident that all combinations have
been examined (assuming recombination occurs freely
between mutations and no point mutagenesis). A 0.04%
rate of point mutagenesis translates to less than 0.6 new
mutations per sequence, of which only a fraction will
affect function (estimated from the activity profile of a
library created by error-prone PCR to be ~0.5; data not
shown).





Figure 3. Activities of fourth, fifth and sixth generation 
pNB esterase variants (Table 1) in screening assay.
Fourth generation variants were recombined and 
screened to identify improved enzymes in generations 5 
and 6. 



generated by simple recombination of the fourth 
generation sequences has been found.† The six
most active variants from generation 5 were col- 
lected and shuffled again to create generation 6. 
Fully 20% of the 474 clones screened were more 
active than 4-54B9. Only 20 to 25% of the clones 
were inactive, as expected using the high fidelity
Pwo-only shuffling conditions. 

Figure 3 summarizes the activities of the four
fourth generation parents and the best variants
identified in generations 5 and 6. The improvement
in enzyme activity as a result of shuffling is
already apparent in the fifth generation, which
includes one variant (5-6C8) fourfold more active
than 4-54B9 and twice as active as variant 5-1A12
previously generated by ligation recombination
(Moore & Arnold, 1996). The sixth generation contains
two clones with yet higher activities than 5-6C8.
The best one, 6-10F1, represents a five to sixfold
improvement over 4-54B9 and is ~150 times
more active than the wild-type.

Activities of the fifth and sixth generation var- 
iants towards the p-nitrobenzyl ester of loracarbef 
(LCN-pNB) were also determined, using a modified
HPLC assay as described in Materials and
Methods. The best pNB esterase is 5-6C8, which
exhibits a threefold increase in total activity over
4-54B9. This clone is now ~100 times more active
than wild-type pNB esterase towards LCN-pNB in 
25% DMF. The sixth generation variants exhibited 
no further improvement in activity towards this 
substrate, a clear reflection of the use of the pNP 
ester during screening and the first law of random 
mutagenesis: "You get what you screen for" (You
& Arnold, 1996). 
Analysis of evolved pNB esterase genes

DNA mutations present in the four parent fourth 
generation sequences and mutations identified by 
sequencing the genes encoding the selected fifth 
and sixth generation variants are summarized in 
Table 1. By comparing the activities and sequences 
of these variants with the third-generation parent, 
four beneficial mutations were identified (leading 
to amino acid substitutions I60V, L334V, L334S
and A343V). The remaining mutations present in 
the fourth generation sequences are neutral or 
mildly deleterious (Moore & Arnold, 1996). 

Several interesting observations can be made 
from this Table. It can be seen that a number of 
mutations increase their frequencies in the sub- 
sequent generations. Substitutions I60V in 4-38B9 
and L334S in 4-54B9 are each present in a single 
fourth generation parent. In contrast, I60V is pre- 
sent in five of the six fifth-generation variants, and 
L334S is present in all six. By the sixth generation 
both substitutions are fixed in the population. A 
new substitution at position 317, first found during 
the fifth generation (5-6C8), also becomes fixed by 
the sixth. This new mutation probably accounts for 
the significant increase in activity of variant 5-6C8. 
The P317S substitution is positioned near the
enzyme surface in a loop located on the same side
of the entrance to the substrate binding pocket as
amino acid substitutions L334S, M358V and A343V
(Moore & Arnold, 1996). Removal of a proline at
this position may relax conformational constraints
on the loop, allowing the substrate freer access to
the active site. 

The two separate beneficial mutations at position 
334 in 4-43E7 and 4-54B9 are mutually exclusive, 
and a competition exists as to which one will be 
propagated to successive generations. Variant 4- 
54B9 has more than twice the activity of 4-43E7 as 
a result of the mutation at position 334, and the
fifth generation recombination progeny in fact 
show the L334S substitution from 4-54B9 exclu- 
sively. Recombination provides a rapid means to 
identify the most effective mutation among mul- 
tiple possibilities at any given site. 

Related to the observation that beneficial 
mutation combinations are fixed is the fact that 
recombination and screening also effectively 
remove neutral and deleterious mutations. Three 
of the five mutations present in the fourth gener- 
ation parents that are synonymous (DNA 
mutations in codons 33, 84, and 239 that do not
lead to amino acid substitutions) or non-synony- 
mous, but believed neutral or mildly deleterious 
in their effects on total activity (mutations leading 
to amino acid substitutions S94G and K267R 
(Moore & Arnold, 1996)), have been removed 
from the improved pNB esterase population in a 
single round of shuffling; all five are removed by 
the sixth generation. The two most active sixth 
generation enzyme variants, 6-10F1 and 6-1D12,
have no synonymous mutations at all and only one
mutation (at position 359) not seen in any previous 



clone. Due to the statistical preference for the 
absence of mutations the recombination process is 
highly effective in filtering out neutral (and deleter- 
ious) mutations starting from multiple parent 
sequences. 

Table 1 also shows that the DNA shuffling tech- 
nique can recombine multiple parent sequences to 
create novel progeny. Recombination between at 
least three fourth-generation parents is required to 
create 5-5E4, and at least three fifth-generation 
parents were recombined to generate clones 6-10F1
and 6-1A6 (based on the presence and absence of 
the DNA mutations in the sequences compared to 
the parent sequences). 

Finally, it is useful to note that DNA shuffling 
generates point mutations that are rarely observed 
during PCR (at least for the low-mutagenesis rate 
PCR conditions used for directed evolution of 
longer DNA sequences). Four of the 12 new point 
mutations identified in the fifth and sixth generation
variants, for example, are G → C (and
C → G) and G → T (and C → A) transversions,
which were not found at all during the first four
generations of pNB esterase evolution involving
PCR mutagenesis (Moore & Arnold, 1996). These
mutations were also generated very rarely during
the error-prone PCR mutagenesis of subtilisin
(Shafikhani et al., 1997). DNA shuffling and error-
prone PCR together may provide access to a wider 
range of amino acid substitutions. 

Evolved pNB esterase amino acid sequences 

Amino acid substitutions in the evolved pNB 
esterases are indicated in Table 1; changes in 
amino acid sequence along the lineage are sum- 
marized in Figure 4. The accumulation and fixation 
of two beneficial amino acid substitutions from the 
fourth generation, I60V and L334S, is essentially 
complete in a single generation of DNA shuffling 
and screening 948 clones. In contrast, A343V, a 
beneficial mutation found in the fourth generation, 
no longer appears in the majority of fifth or sixth 
generation variants. The (5-4H4) recombinant of
the parent containing this mutation (4-53D5) with
4-54B9 shows no improvement in activity over 4-54B9
(Figure 3). Substitutions A343V and L334S
therefore do not work in concert to improve 
enzyme activity, and consequently there is little or 
no driving force to retain A343V in the population. 
The remaining fifth generation variants, with the 
exception of 5-6C8, are less active than 5-1A12 
(Figure 3), yet they contain the I60V and L334S 
substitutions while omitting K267R, as does 5-1A12.
This suggests that the additional mutations
found in those sequences are neutral or, possibly,
deleterious. For instance, the amino acid sequences
of 5-5E4 and 5-1A12 are identical, and the
decreased activity of the former is likely due to the
two synonymous mutations in 5-5E4 not present in
5-1A12. Because the screen evaluates the total
activity of a clone (normalized by cell density), 
synonymous mutations can influence the result, for 
Figure 4. Lineage of pNB esterase
variants showing amino acid substitutions
accumulated by four generations
of sequential random
mutagenesis (fourth generation)
and by DNA shuffling (fifth and
sixth generations) and screening.
All variants contain amino acid
substitutions H322R, Y370F, M358V
and L144M from the third generation
parent (Moore & Arnold,
1996).



example, by affecting the amount of active enzyme 
expressed. The new beneficial mutation that gives 
rise to the P317S substitution becomes fixed in the 
sixth generation, and further evolution during that
generation primarily arises from point mutation 
rather than recombination. 

Clones 5-4G2 and 5-4D12, whose DNA
sequences are identical, both contain amino acid
substitutions H356R and I464V. These two substitutions
are seen together again in 6-10F1 and 6-1C7.
Because 6-10F1 and 6-1D12 have almost identical
activity, we can reasonably infer that the I60V,
P317S, and L334S substitutions are responsible for 
that activity, while the mutations leading to H356R 
and I464V from the fifth generation as well as a 
new mutation, T359A, in 6-1D12 are neutral. The 
three mutations believed responsible for enhanced 
activity are also present in 6-1A6, along with the 
last mutation in this system known to enhance
activity, A343V. That 6-1A6 has lower activity than
6-10F1 and 6-1D12 is therefore attributable to either
the three synonymous mutations in 6-1A6 (Table 1)
or antagonism between amino acid substitutions
A343V and P317S or I60V.

The new point mutations that arose during the 
minimally mutagenic DNA shuffling increased 
(P317S) and decreased enzyme activity. The effects 
of individual mutations can be ascertained with 
confidence because the sequences differ from one 
another at very few positions. We have recently 
demonstrated a method that allows one to dis- 
tinguish clearly beneficial, neutral and deleterious 
mutations in evolved sequences by random recom- 
bination with ancestor sequences (Zhao & Arnold, 
1997b). This method will be particularly useful for 
identifying mutations responsible for functional 
changes in proteins in a background of neutral 



mutations (as happens when multiple new
mutations are present). 

Only 2% of the fifth generation clones are more 
active than the most active parent, 4-54B9 
(Figure 2). Although 25% of the progeny should be 
novel, the combination I60V + L334S predominates 
in the most active variants (Figure 4), suggesting 
that many of the remaining combinations lead to 
lower activity than in 4-54B9. Additionally, while 
there is no mechanism for recombination alone to 
generate inactive clones, ~25% of the variants in
Figure 2 are inactive, presumably as a result of 
background point mutation. This implies that the 
frequency of enhanced-activity recombinants is 
reduced by point mutation and emphasizes the 
importance of minimizing the mutagenesis rate 
when recombining positive mutations. 

Developing strategies for directed evolution 

Recombination versus random mutagenesis 

Recombination is only useful if a population of 
sequences is available from which new combinations
of mutations can be generated. Homologous
proteins with similar sequences could
provide such a starting population (Stemmer, 
1994b). (Note, however, that a high level of 
sequence identity may be required for DNA shuf- 
fling.) Populations of sequences can also be created 
by the background point mutagenesis feature of 
DNA shuffling (Crameri et al., 1996). Alternatively,
they can be generated by random mutagenesis and 
screening experiments, as they have been for the 
current study. When interesting sequences already 
exist, recombination offers an efficient means to
use that information. If the sequences must be gen- 
erated, however, then one should consider that 
cost in the overall cost of evolution by recombina-
tion as compared to, for example, evolution by 
sequential generations of random mutagenesis and 
screening. 

In theory, the sequential (or "asexual") approach 
requiring the least labor in terms of screening is to 
screen randomly mutagenized clones until a positive
is identified and then use that as the template
for the next generation. The process is a random
walk in which the first uphill step encountered is
taken. To take a simple illustration, consider three
mutations A, B and C that each contribute in a 
cumulative, if not additive, manner when com- 
bined. A, B and C could be collected in the ABC 
variant in three sequential generations of mutagen- 
esis and screening. Alternatively, if A, B and C all 
contribute to the desired feature in the wild-type
background (as they often do; see, for example, 
Chen & Arnold, 1993), they could be found separately
and then recombined to make ABC. Finding
the single-mutation sequences A, B, and C, how- 
ever, requires screening the same number of colo- 
nies as finding ABC by sequential evolution. 
Recombining the A, B, and C sequences to make 
ABC requires additional screening. Of course, the 
sequential pathway requires three random muta- 
genesis steps, while the recombination pathway 
requires only one mutagenesis step and one DNA 
shuffling step. The advantages of one approach 
over the other then depend on the costs of screen- 
ing relative to the DNA manipulations. 
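
The A, B, C comparison can be made concrete with a rough screening budget. The Python sketch below is hypothetical: the function names are invented, the figure of 1750 clones per positive is an assumed midpoint of the one-in-1500-to-2000 hit rate reported for pNB esterase, and the shuffling step is costed as the number of clones needed to see the all-mutation recombinant with 95% confidence, ignoring point mutagenesis.

```python
import math

CLONES_PER_POSITIVE = 1750  # assumed midpoint of the reported 1500 to 2000 range


def sequential_screening(n_mutations: int, clones_per_positive: int = CLONES_PER_POSITIVE) -> int:
    """One random mutagenesis generation per beneficial mutation."""
    return n_mutations * clones_per_positive


def shuffling_screening(n_mutations: int, confidence: float = 0.95) -> int:
    """Clones needed to find the recombinant carrying every mutation."""
    p_all = (1.0 / n_mutations) ** n_mutations
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_all))


def recombination_screening(n_mutations: int, clones_per_positive: int = CLONES_PER_POSITIVE) -> int:
    """Find each single-mutation parent, then screen the shuffled library."""
    return n_mutations * clones_per_positive + shuffling_screening(n_mutations)


seq_total = sequential_screening(3)
rec_total = recombination_screening(3)
print(seq_total, rec_total, rec_total - seq_total)
```

Under these assumptions the recombination route adds only a small screening overhead for three mutations, so the choice between the two routes turns mainly on the cost of the DNA manipulations, as the text argues.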

Note that the severe limitations screening places 
on the number of colonies that can be sampled 
makes it difficult to accept downhill steps in the 
hope that further improvements can be found 
further out in sequence space (Moore & Arnold, 
1997). It also means that extremely rare events 
such as the recombination of neutral or slightly 
deleterious mutations to make a beneficial combination
will probably not contribute in any significant
fashion to the evolutionary process.

The pNB esterase evolution provides a concrete 
example for analysis. Approximately one in every 
1500 to 2000 randomly mutagenized pNB esterase 
clones screened was positive (showing 50% or 
greater enhancement in activity over the parent; 
Moore & Arnold, 1996). To generate the population 
of four unique positives for DNA shuffling, we 
examined a total of 7500 clones. Finding the best 
combination variant required additional DNA 
shuffling experiments, and ~1400 additional colo-
nies were screened. Thus a total of 9000 clones
were screened in going from generations 3 to 6.
There is no guarantee that the sequences chosen 
for recombination are unique: in fact, the original 
fourth generation clones contained five variants, 
two of which were identical (4-38B9 and 4-54B9) 
and two of which contained mutations in the same 
codon (4-43E7 and 4-54B9), precluding recombina- 
tion between these variant pairs. It is very likely 
that variants of comparable or even greater activity 
could also have been created by continuing ran-
dom mutagenesis and screening for three gener-
ations from the first fourth generation variant
identified. The total screening requirement would 
be the same. 

In practice, however, the uphill climb often 
involves identification of multiple positives during 
each generation. Everything but the one chosen to 
parent the next generation is discarded in the ran- 
dom uphill walk of the "asexual" evolution. 
During the pNB esterase evolution, we often ident- 
ified four or five potential positives during the 
rapid screen on the LCN-pNP colorimetric sub- 
strate. Those were either verified or not during a 
second level screen on the p-nitrobenzyl (LCN-
pNB) substrate, and it was often the case that more
than one sequence was a true positive (Moore & 
Arnold, 1996). The other improved sequences 
could of course be collected and recombined at any 
time and at relatively little screening cost. A signifi- 
cant advantage of the DNA shuffling method is its 
ability to utilize these available positive sequences. 

Computer simulations of random recombination 
and screening 

The statistical model can be used to optimize the 
number of parent sequences chosen for DNA shuf- 
fling. Screening during the fourth generation actu- 
ally resulted in the identification of 16 clones 
measurably more active than the parent, of which 
five were at least 50% more active (Moore & 
Arnold, 1996). An attempt to recombine all 16 
sequences yielded no clones more active than 4-
54B9 (~1000 clones screened). This result can be
understood when we consider the dramatically 
lower probability of finding the best combi- 
nation(s) as the number of sequences increases. If 
the screening sample size is limited to a few thou-
sand clones, there is little chance that the best
sequences, or even sequences better than the best
parent, will be found by screening a library created
from 16 parents. 

We have used a computer simulation of the ran- 
dom sampling of the two recombined libraries 
obtained by shuffling five and ten sequences to 
illustrate the advantage of choosing fewer parents 
when screening is limited. Recombining all ten
parents becomes advantageous, however, when
large numbers of clones can be examined. (Of
course, the larger sampling requirement should 
then be compared to the potential for continued 
evolution by random mutagenesis.) Assuming that 
the ten parent sequences each contain a unique,
single beneficial mutation (N = M) and that they
can be recombined to give all possible combi-
nations, we calculated P(μ) for μ = 0 through 10.
Since Σ P(μ) = 1, these were organized into a cumu-
lative distribution from 0 to 1, and a random num-
ber generator was used to pick a point on the
cumulative distribution, thereby identifying μ
(number of mutations per sequence). A second ran-
dom number generator was used to pick one of the
possible sequences containing μ substitutions
using an evenly spaced distribution of possible
combinations. The activity of the sequence chosen
was then calculated by assuming that the free ener- 
gies of activation of the variants (proportional to 
the natural logarithms of their activities) are addi- 
tive. 
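
The sampling procedure described above can be sketched in Python. This is a minimal illustration, not the program used for Figure 5. It assumes that each of the M mutations is inherited independently with probability 1/N, so that the number of mutations per clone follows a binomial distribution. The function names and the fold-improvement values are hypothetical placeholders, not the measured pNB esterase activities.

```python
import math
import random


def mutation_count_probs(n_parents, n_mutations):
    """P(mu) for mu = 0..n_mutations, assuming each mutation is inherited
    independently with probability 1/n_parents (an assumption of this sketch)."""
    p = 1.0 / n_parents
    return [
        math.comb(n_mutations, mu) * p**mu * (1.0 - p) ** (n_mutations - mu)
        for mu in range(n_mutations + 1)
    ]


def sample_clone_activity(fold_improvements, rng):
    """Draw one recombinant and return its activity relative to the parent."""
    n = len(fold_improvements)  # N = M: one beneficial mutation per parent
    probs = mutation_count_probs(n, n)
    # First random number: pick a point on the cumulative distribution of mu.
    u = rng.random()
    cumulative = 0.0
    mu = n
    for k, pk in enumerate(probs):
        cumulative += pk
        if u < cumulative:
            mu = k
            break
    # Second random draw: pick one of the equally likely sets of mu mutations.
    chosen = rng.sample(range(n), mu)
    # Additive free energies: the logarithms of the activities add.
    log_activity = sum(math.log(fold_improvements[i]) for i in chosen)
    return math.exp(log_activity)


def best_of_screen(fold_improvements, sample_size, rng):
    """Highest activity found when sample_size random clones are screened."""
    return max(
        sample_clone_activity(fold_improvements, rng) for _ in range(sample_size)
    )


def average_best(fold_improvements, sample_size, trials, seed=0):
    """Average of the best activity over several independent screening trials."""
    rng = random.Random(seed)
    total = sum(
        best_of_screen(fold_improvements, sample_size, rng) for _ in range(trials)
    )
    return total / trials
```

Calling `average_best` with five and with ten placeholder activities over a range of sample sizes gives curves of the kind the text goes on to compare.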

The results of this simulation are shown in
Figure 5, using the activity data from the fourth
generation pNB esterase variants. Figure 5(a)
shows the averages of the highest values of mutant
activities obtained over 15 separate trials for each
(screening) sample size. The results obtained by
shuffling the ten best mutants (black diamonds)
can be seen to be slightly worse than those
obtained by shuffling the five best mutants (white
squares), for sample sizes up to about 10,000 to
15,000. That is, the average expected best mutant is
higher for shuffling five parents at a time for small
sample sizes. Figure 5(b) and (c) show the range of
values of the highest mutant activity obtained on
each of 15 separate trials for each sample size.
Here, the highest values obtained from recombin-
ing the best ten variants (black diamonds) become
better than the values obtained from shuffling the
best five (white squares) at sample sizes greater
than about 1000. Although shuffling the top ten
mutants for this set of data can yield higher final
activities, the simulation shows that the outcome is
much more risky when screening capabilities are 
limited to a few thousand clones. 

Simulations also show that the results of the
comparison of shuffling five versus ten parents are
highly sensitive to the values of the activities. For
instance, if the activities of mutants 6 through 10 
are decreased, then the sample size at which 
recombining all ten mutants becomes preferable 
becomes much higher. Moreover, the simulation 
can be adapted for cases in which some or all of 
the parent sequences have two or more mutations, 
which may or may not be recombinable. Thus this 
simulation approach can be used to determine the 
optimal number of sequences to recombine for any 
given set of activity values and any given sample 
size. 

The simple additivity assumption on which
these simulations are based† is a reasonable first
approximation of the behavior of combined
mutations in proteins (Wells, 1990) and is useful
for a first exploration of strategic issues in in vitro
protein evolution. The real behavior is often more
complex and will depend on the property of inter-
est as well as the particular protein. However, it is
likely that deviations from simple additivity are
governed by non-linear functions of the number
and magnitude of changes; values will certainly
depend on which subset of mutations is recom-
bined. While it is possible to modify the simulation
to take into account deviations from additivity,
very little data are available on the effects of large
numbers of mutations. We have therefore not
attempted to include deviations from additivity in
the current simulations. Figures 5(a), (b), and (c)
show the activities of the best fourth generation
parent (4-54B9) and the best fifth generation clone
identified (5-6C8) by screening the shuffled library.
That the activity of 5-6C8 is ~twofold less than the
average expected for screening 948 clones reflects
the fact that (i) only four of the original five posi-
tive clones identified during generation 4 were
unique, (ii) two mutations were on the same codon
and could not be recombined, and (iii) the
mutations combine with significantly less than
100% additivity.

† Both beneficial and deleterious mutations can be
accommodated in this framework.

Figure 5. (a) Averages of highest values of mutant
activities obtained over 15 separate trials of simulated
random recombination of five and ten parent sequences.
(b) and (c) Range of values of mutant activities obtained
over 15 separate trials. Activities of best fourth-gener-
ation parent (4-54B9) and highest-activity fifth gener-
ation clone identified (5-6C8) are indicated for
comparison.

Figure 6. Pairwise recombination can reduce screening
requirements, provided effects of mutations are cumu-
lative. By shuffling two sequences at a time, sequences
containing two mutations represent 25% of the recom-
bined library. This example involves six recombination
experiments.



Alternative search strategies 

Finally, we will briefly consider two other search
strategies that might be used to minimize screening
requirements. One approach to producing a mul-
tiple-mutation variant which requires the screening
of far fewer clones is multiple-step pairwise recombi-
nation. This strategy is illustrated in Figure 6 for
the simple case of recombining four (beneficial) 
mutations from four separate parents. Pairs of 
parents are mated. As each progeny is a double 
mutant 25% of the time, only 12 clones are 
required to find all the double mutants, assuming 
the effects of the mutations are cumulative. The 
double mutants are then similarly mated, and
screening only eight clones will identify the triple
mutants. Mating and screening four clones will
generate the quadruple mutant. Thus a total of 
only 62 clones (24 x 2.6 times oversampling to be 
95% confident at each step) must be screened, as 
compared to the 765 required to generate the quad- 
ruple mutant in a single recombination step. Such 
an approach requires considerable DNA manipu- 
lation and would be most useful when screening is 
extremely difficult. (An attractive alternative at this 
point may be sequencing the parents and recombi- 
nation by site-directed mutagenesis.) A further cost 
of this approach is that the search space is very 
limited. The assumption is that each activity-
enhancing mutation will contribute to the overall
activity, so that the quadruple mutant is the best 
performer of this population. If a particular double 
or triple mutant is the best performer, it may or
may not be found, since not all of these intermedi-
ate mutants will have been examined.
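
The screening totals quoted in this paragraph can be reproduced with a short calculation. The sketch below assumes that the "95% confident" criterion means screening enough clones that a variant present at frequency p appears at least once with probability 0.95. The function name is hypothetical.

```python
import math


def clones_for_confidence(p, confidence=0.95):
    """Number of clones n satisfying 1 - (1 - p)**n = confidence,
    i.e. the variant of frequency p is seen at least once."""
    return math.log(1.0 - confidence) / math.log(1.0 - p)


# Pairwise shuffling: a double mutant makes up 1/4 of the progeny of a cross.
per_cross = clones_for_confidence(0.25)   # about 10.4 clones per cross
oversampling = per_cross * 0.25           # about 2.6 times the expected 1/p = 4

# Six crosses (3 + 2 + 1) of 4 expected clones each, with oversampling.
pairwise_total = 6 * per_cross            # about 62 clones

# Single-step shuffling of four parents: the quadruple mutant has
# frequency (1/4)**4 = 1/256.
single_step_total = clones_for_confidence(0.25 ** 4)   # about 765 clones
```

Under this reading, the 2.6-fold oversampling factor and the totals of 62 and 765 clones all follow from the same expression.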

A compromise method that works well, at least 
in theory, can be described as "population recom-
bination." The idea is to shuffle all four parent
sequences at once and screen enough clones to see 
all the double mutants. Because each double 
mutant occurs 3.5% of the time, 28 clones must be 
screened. This examines all of the pair-wise inter- 
actions between mutations and eliminates those 
which are not cumulative. The double mutant
population is recombined to produce all of the tri-
ple mutants and the quadruple mutant (requires
screening 16 clones). If the mutations were at least
cumulative in their effects, screening 132 (44 x 3.0 
times oversampling) clones would search the space 
completely for the best (quadruple) mutant. This 
approach most closely describes how recombina- 
tion/selection experiments operate (Stemmer, 
1994a) where all of the clones that survive a par- 
ticular selection criterion are recombined (often 100 
clones or more serving as the parent population for 
the next generation). 
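
The 3.5% figure for a particular double mutant can be checked with a small sketch. It assumes, as an illustration only, that each of the four mutations is inherited independently with probability 1/4 when four parents are shuffled. The function name is hypothetical.

```python
import math


def genotype_frequency(n_parents, n_mutations, n_present):
    """Frequency of one specific genotype that carries n_present of the
    n_mutations and lacks the rest, with inheritance probability 1/n_parents."""
    p = 1.0 / n_parents
    return p**n_present * (1.0 - p) ** (n_mutations - n_present)


# One particular double mutant out of four shuffled parents.
double = genotype_frequency(4, 4, 2)   # (1/4)**2 * (3/4)**2 = 9/256, about 3.5%
expected_clones = 1.0 / double         # about 28 clones per double mutant

# Oversampling needed for 95% confidence of seeing that double mutant once.
oversampling = math.log(0.05) / math.log(1.0 - double) * double
```

With these assumptions the oversampling factor comes out slightly below the 3.0 used in the text.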



Conclusions 

Recombination is an important tool for directing 
the evolution of proteins. Beneficial mutations can 
be recombined, while neutral and deleterious 
mutations are eliminated. The need to screen rather 
than select for many important enzyme functions,
however, severely limits the ability to search for 
useful combinations. It is therefore imperative to 
analyze various recombination strategies. Muta- 
genic rates associated with the recombination pro- 
cess must be low so that beneficial mutations are 
not lost in a background of deleterious ones. 
Although a new beneficial amino acid substitution 
was found as a result of the DNA shuffling of pNB 
esterase, DNA shuffling may be less efficient for 
discovery of new mutations compared to a con- 
trolled mutagenesis technique (a beneficial 
mutation can be masked in the background of 
recombined sequences). Utilizing more than two 
parents for recombination introduces a statistical 
preference for not incorporating mutations in pro- 
geny, and this has several consequences especially 
with respect to screening. Recombination favors 
the dilution of progeny containing the most
mutations, which has the effect of exponentially
increasing the number of progeny that must be
screened in order to find the rarest ones. Because 
shuffling large numbers of parent sequences can 
yield many possible combinations, it may also be 
necessary to strictly limit the number of parent 
sequences in any given recombination experiment. 
We have described two alternative search strat- 
egies which reduce the required number of var- 
iants examined, at the cost of possibly missing 
intermediate beneficial combinations. 

Finally, recombination requires a population of 
positive variants for efficient enzyme improve-
ment. If a population of positive variants must first
be generated, sequential random mutagenesis may
require less effort to produce sequences containing
multiple mutations. Multiple positive variants are
often generated, however, during a single cycle of
random mutagenesis and screening. Recombina-
tion of these positives can provide substantial
improvements at relatively little cost. 



Materials and Methods 

DNA shuffling 

DNA shuffling was performed as described by
Stemmer (1994b) with modifications. The 2 kb DNA frag-
ment encoding the B. subtilis pNB esterase gene was
amplified using PCR (forward primer 5'-CAATCTA-
GAGGGTATTAATAATG-3' and reverse primer 5'-
CGCGGGATCCCCGGGTACCGGGC-3'). The amplified
DNA was purified by gel electrophoresis and extraction
using a Qiaex kit (Qiagen, Chatsworth, CA). A total quan-
tity of ~10 ng DNA, either from a single parent (non-
recombinatorial) or from a mixture of multiple parent
sequences (recombinatorial), was digested with DNase I
(0.0015 units/µl) at room temperature for 20 minutes in
a 100 µl reaction. After ethanol precipitation, the digested
DNA was electrophoresed as a smear in a 3% low melt-
ing temperature gel of NuSieve GTG Agarose (FMC Bio
Products, Rockland, ME). DNA fragments in specified
molecular size ranges were collected onto DE-81 filter
paper disks (Whatman, Maidstone, England) by electro-
phoresis and eluted from the filter paper with 400 µl of
10 mM Tris/1 mM EDTA buffer (pH 8.0) containing 1 M
NaCl. The DNA fragments were ethanol precipitated
and redissolved to approximately 20 ng DNA/µl in
1× Pwo DNA polymerase buffer (Boehringer Mann-
heim, Indianapolis, IN) containing 2 mM MgSO4 and
0.2 mM each of the four dNTPs. A 5 units/µl Pwo DNA
polymerase solution (Boehringer Mannheim) was diluted
tenfold, and 0.5 µl was added to a 5 µl redissolved DNA
reaction solution. Reassembly of DNA fragments was
conducted by PCR, using the conditions 94°C for 40
seconds, then 70 cycles of 94°C for 30 seconds, 50°C for
30 seconds, 72°C for 30 seconds, followed by a final
extension step at 72°C for five minutes. A second 0.5 µl
of Pwo polymerase was added at the annealing step of
the 35th cycle. The reassembled DNA fragments were
amplified in a conventional PCR (25 cycles) with the
dilution of 1 µl reassembled DNA fragments in a 100 µl
reaction. Once the success of the reassembly/amplifica-
tion reactions was verified by gel electrophoresis, the
reassembled product was purified with a Wizard PCR
prep kit (Promega Corp., Madison, WI), digested with
BamHI and XbaI, concentrated by ethanol precipitation,
and electrophoresed in an agarose gel. The 1.8 kb pro-
duct was cut from the gel and the DNA extracted using
a Qiaex kit. The final products were ligated with the vec-
tor generated by BamHI-XbaI digestion of pNB106R
(Zock et al., 1994). This library was used to transform
competent E. coli TG1 cells, as described (Moore &
Arnold, 1996).

Screening a pNB esterase library

Screening was based on the assay described pre-
viously (Moore & Arnold, 1996), using the p-nitrophenyl
ester of the loracarbef nucleus (LCN-pNP) as substrate.
E. coli TG1 containing the plasmid library were grown
on LB/tetracycline (20 µg/ml) plates. After 36 hours at
30°C, single colonies were picked into 96-well plates con-
taining 100 µl LB/tetracycline medium per well. These
plates were shaken and incubated at 30°C for 12 hours
to let the cells grow to saturation. Aliquots (20 µl) of the
cultures were inoculated into a fresh plate containing
100 µl media per well; these were incubated at 40°C for
ten hours with shaking to induce the expression of pNB
esterase. Esterase activities were then measured by trans-
ferring 20 µl aliquots of the cell cultures into a fresh set
of plates where they were mixed with 200 µl of 0.1 M
Tris-HCl (pH 7.0), 25% DMF and 2 mM LCN-pNP. Reac-
tion velocities were measured at 450 nm over ten min-
utes (11 data points) in a ThermoMax microplate reader
(Molecular Devices, Sunnyvale, CA). Activities were nor-
malized by the cell densities of the original wells
measured at 600 nm to control for variations in cell
quantities.
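
The normalization described above can be illustrated with a short calculation. This sketch is an assumption about how such data might be reduced, not the plate reader's own software. It takes the reaction velocity as the least-squares slope of absorbance at 450 nm against time and divides it by the cell density at 600 nm. It also assumes the 11 readings are spaced one minute apart. The trace values and function names are hypothetical.

```python
def reaction_velocity(times_min, absorbances):
    """Least-squares slope of absorbance versus time (absorbance units/min)."""
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_a = sum(absorbances) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_min, absorbances))
    return sxy / sxx


def normalized_activity(times_min, a450, od600):
    """Reaction velocity divided by the cell density of the source well."""
    return reaction_velocity(times_min, a450) / od600


# Hypothetical well: 11 readings over ten minutes, rising 0.02 units/min.
times = list(range(11))
readings = [0.10 + 0.02 * t for t in times]
activity = normalized_activity(times, readings, od600=0.8)
```

Dividing by the 600 nm reading puts wells with different amounts of cells on a common scale before positives are ranked.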

For each round of screening, the clones that showed 
the highest activities were re-streaked on LB/tetracycline 
agar plates, and single colonies derived from these plates 
(three to four colonies from each clone) were inoculated 
simultaneously into 96-well plates and tube cultures. The 
former were used to repeat the activity assay, as 
described above, and the latter were used for glycerol 
stock and plasmid preparation (Qiawell kit, Qiagen). 

Assay of pNB esterase activity on LCN-pNB 

A modified HPLC assay was used to determine
enzyme activity towards the LCN-pNB (p-nitrobenzyl
ester) substrate (Chen et al., 1995). The bacterial cells
were incubated at 30°C with shaking for 12 hours and
then at 40°C for ten hours to induce expression of pNB
esterase. Aliquots of cells (200 µl) were incubated with
300 µl reaction buffer for 30 minutes at room tempera-
ture. The final reaction mixtures contained 0.1 M Tris-
HCl (pH 7.0), 25% DMF and 2 mM LCN-pNB. The reac-
tions were stopped by addition of 500 µl acetonitrile and
passed through a nylon syringe filter (Micron Separ-
ations, Inc., Westboro, MA) with a pore size of 0.45 µm.
Aliquots of each sample (50 µl) were analyzed by HPLC
on a 250 mm × 4.6 mm C18 reverse-phase column
(Vydac, Hesperia, CA) at room temperature using a
linear gradient starting with 50:50 of A:B (A = 5%
methanol/95% 1 mM triethylamine, pH 2.5; B = 100%
methanol) and ending with pure B in eight minutes
(flow rate of 1 ml per minute). Product and substrate
were detected at 270 nm. The area of the p-nitrobenzyl
alcohol product peak was calculated, and the area of
the same peak from a sample containing E. coli without
a pNB esterase gene was subtracted from it. This controls for the
small quantities of free product in the substrate prep-
aration and any interference from bacterial contami-
nation. This final area was used as a measure of total
activity, which was normalized by cell density.
Applications of DNA shuffling to pharmaceuticals and vaccines

DNA shuffling is a practical process for directed molecular
evolution which uses recombination to dramatically accelerate 
the rate at which one can evolve genes. Single and multigene 
traits that require many mutations for improved phenotypes 
can be evolved rapidly. DNA shuffling technology has 
been significantly enhanced in the past year, extending its 
range of applications to small molecule pharmaceuticals, 
pharmaceutical proteins, gene therapy vehicles and 
transgenes, vaccines and evolved viruses for vaccines, and 
laboratory animal models. 

Introduction

The sequence-design processes used by nature have 
yielded results far superior to those obtained so far by 
rational approaches to the design of biological structures 
and systems. The rational design of proteins, for example, 
is performed by computer modeling of individual changes, 
followed by construction of the corresponding DNA and 
expression of the recombinant protein for testing and 
evaluation. This approach seeks to design proteins for 
specific tasks by drawing correlations between amino 
acid sequence and specific shapes in an attempt to 
understand a complex system through its topography. At 
its root, our limited ability to rationally engineer complex 
biological systems stems from three main factors: limited 
structure/function knowledge of most proteins and of their 
complex interactions with expression machinery and other 
molecules within the cell; the computationally intensive 
and approximate nature of the modelling required; and 
the fact that the number of relevant mutants for which 
one would wish to make predictions in order to optimize 
a given function ranges from millions to numbers of 
astronomical proportions [1]. 

In contrast to rational design procedures, nature em- 
ploys mutation, selection and recombination to evolve 
highly adapted individuals from the effectively infinite 
possibilities encoded implicitly in the genome. Recent
technological advances have demonstrated that it is now
possible to mimic these natural evolutionary processes.
Benchtop in vitro evolution techniques are used to
construct libraries as large as 10^^ molecules [2-5].
Researchers can mimic natural evolution by searching 
these libraries by, for example, affinity panning of 
phage-displayed or RNA ligands against pharmaceutical 
targets, for the best candidates for a specific task. Repeated 
rounds of selection and amplification of candidates have
already produced improved enzymes and novel molecules 
capable of binding their targets with higher affinity than 
their natural counterparts. Unlike natural selection, in 
which multiple environmental forces select organisms with 
genomes that allow them to meet a variety of challenges, 
in vitro evolution exerts focused selection pressure on 
organisms in isolation, enabling the rapid development of 
variants with highly specialized traits. 

Despite the enormous potential of these techniques, 
determination of the best strategy to exploit this diversity 
has been the topic of much debate. The most popular 
methods of creating combinatorial libraries are strategies 
that seek to evolve sequences that have individual point 
mutations or blocks of oligonucleotide encoded mutations. 
At present, most researchers use either repeated cycles
of 'error-prone PCR' [6,7] or repeated oligonucleotide
directed mutagenesis [8] to create these 'point mutation'
libraries. Error-prone PCR employs a low fidelity repli-
cation step to introduce random point mutations at each
round of amplification [6]. This method has the advantage 
of simplicity and ease of use. The power of these methods 
is limited, however, by one's ability to identify critical 
regions for mutagenesis and because, generally, only small 
regions of the genome can be mutagenized to saturation 
and be exhaustively sampled in screens or selections, due 
to limited library sizes relative to the size of the sequence 
spaces defined by exhaustive random searches [9]. 

Iterative cassette or point mutagenesis can overcome 
some of these limitations; however, as discussed below, 
DNA shuffling profoundly accelerates the process. In this 
review we summarize recent advances which extend the 
range of application of this technique to small molecule 
pharmaceuticals, pharmaceutical proteins, gene therapy 
vehicles and transgenes, vaccines, and evolved viruses for 
laboratory animal models of disease. 

Recombination 

A key limitation of random point mutagenesis and random 
cassette mutagenesis strategies can be traced to the 
fact that they introduce random 'noise' into the gene 
population at every cycle, and hence improvements are 
limited to small steps. If the noise level is too high 
relative to the library size and the selection stringency, the 
message will gradually become riddled with deleterious
mutations. This is analogous to the phenomenon of
Muller's Ratchet [10] from population biology in which, 
in the absence of sexual recombination, deleterious 
mutations build up in a population over time. From 
decades of plant and animal breeding, classical breeders 
have learned to use recombination to rapidly evolve 
improved sequences for a specific task. Sexual replication 
in combination with directed selection can produce sub- 
stantial improvements in a highly diverse genome within 
just a few cycles. The widespread prevalence in nature 
of sexual versus asexual reproduction has been the topic 
of much debate [11]. A gene favouring parthenogenetic rather
than sexual reproduction would quickly take over in a
population unless countervailed by selection, the so-called
twofold cost of sex. One hypothesized major force for the
maintenance of sexual reproduction is that recombination 
of natural diversity allows for populations to rapidly evolve 
in order to adapt to changing physical environments or 
pathogens by combining beneficial mutations into single, 
more fit individuals and deleterious mutations into less fit 
individuals that are then selected out of the gene pool [11]. 

The technique of DNA shuffling comes the closest of any 
laboratory technique to mimicking natural recombination 
by allowing the in vitro homologous recombination of 
DNA [12,13]. This technique not only recombines DNA 
fragments, but also introduces point mutations at a very 
low, controlled rate [14], thus, combining recombination, 
point mutation, and selection techniques to create a 
general, parallel algorithm for evolving improved genes. 
This parallel search strategy is analogous to that used 
by massively parallel supercomputers [9], such as those 
used for climate modelling or to factor very large
numbers. Figures 1 and 2 summarize the practical and the-
oretical advantages of DNA shuffling relative to existing
recursive mutagenesis methods, such as error-prone PCR
or recursive oligonucleotide directed mutagenesis. Recent 
progress with DNA shuffling has clearly demonstrated 
the utility of recombination for accelerating molecular 
evolution through simultaneously permuting both single 
mutations and large sequence blocks (Table 1). DNA 
shuffling combined with focused selection pressure in the 
laboratory will allow one to rapidly evolve genes for a 
wide variety of industrial applications: the optimization 
of enzymes, such as proteases, lipases, amylases and 
cellulases; the development of metabolic pathways spe- 
cialized to synthesize large amounts of specialty chemicals, 
antibiotics, or pharmaceutical proteins; organisms designed 
for bioremediation; and plasmids or viruses for novel 
vaccines and gene therapy applications (Table 2). These 
emerging applications are discussed below. 
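
The idea of permuting sequence blocks from several parents can be illustrated with a toy in silico model. The sketch below is not a simulation of the laboratory procedure: it ignores fragmentation, annealing and point mutation. It assumes aligned parents of equal length and simply switches donor at randomly chosen crossover positions. The function name and sequences are hypothetical.

```python
import random


def shuffle_chimera(parents, n_crossovers, rng):
    """Build one chimera from aligned, equal-length parent sequences.

    Toy model only: crossover positions are drawn uniformly, and the donor
    parent is re-drawn at random for every segment.
    """
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("parents must be aligned to the same length")
    cuts = sorted(rng.sample(range(1, length), n_crossovers))
    bounds = [0] + cuts + [length]
    pieces = []
    for start, end in zip(bounds, bounds[1:]):
        donor = rng.choice(parents)
        pieces.append(donor[start:end])
    return "".join(pieces)


rng = random.Random(0)
parents = ["AAAAAAAAAAAA", "CCCCCCCCCCCC", "GGGGGGGGGGGG"]
library = [shuffle_chimera(parents, 3, rng) for _ in range(5)]
```

Each library member is a mosaic whose every position comes from one of the parents, which is the property that lets recombination permute existing diversity rather than add random noise.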

Family shuffling dramatically increases the
rate of stepwise evolution

Forced hybridization between species, such as was 
performed in the cross-breeding of the plum and apricot
to yield the plumcot, was recognized last century by plant
breeders as a highly effective method for generating novel, 
functional varieties with phenotypes differing dramatically 
from either parent [15]. The utility of this strategy has 
now been demonstrated at the level of a single gene. To 
evaluate whether recombining natural diversity accelerates 
the evolution process, the efficiency of obtaining a 
new substrate specificity from four homologous enzyme 
genes evolved separately was compared with that from a 
recombined pool of the four genes. The essential goal was 
to compare the rate of evolution of a single gene that is 
subjected to random mutation and selection to the rate 
of evolution of a library of genes created by recombining 
existing, functional genetic diversity present in a family 
of homologous genes. Since all of the recombinants are 
created from diversity that has proven functional in the
context of its parental gene, the hope is that such libraries 
would be of much higher quality than random libraries. 
The results affirm this view. One cycle of single gene 
shuffling yielded eightfold improvements from each of 
the four separately evolved genes, versus a 270-540-fold 
improvement from the four genes shuffled together 
([16••]; Figure 3a). This represents an approximately
50-fold increase in the rate of improvement per cycle. The 
best clone contained eight segments from three of the four 
genes as well as 33 amino acid point mutations. It is worth 
emphasizing that this evolved improvement relative to the 
initial gene pool was obtained in a single cycle of gene 
shuffling, rather than requiring many recursive cycles. 

Thus, in contrast to classical breeding techniques, DNA 
shuffling allows one to readily recombine DNA derived 
from 'separate' species or genera. This results in a much
more sparse sampling of sequence space (Figure 3b), in 
which the average similarity between library members is 
much lower than with other strategies. Sparse sampling 
yields mutants that, after a single cycle, are far more 
divergent from the parental genes than is possible 
with single gene shuffling or point mutation strategies. 
This recent experiment demonstrates that cross-species 
recombination is a remarkable accelerant of molecular 
evolution. Family shuffling will be widely applicable to the 
commercially important problems discussed below. 

Protein pharmaceuticals 

Recombinant pharmaceutical proteins form a multi-billion 
dollar sector of the pharmaceutical industry. This industry 
relies principally on cloning existing genes encoding 
cytokines, growth factors, and enzymes that are the 
products of millions of years of evolution. Many of these 
products have side effects that cause serious complications 
which limit or preclude their clinical use. DNA shuffling provides
a technology for rapidly improving pharmaceutical proteins through
selective breeding for enhancement of desirable biological 
activities, while eliminating or reducing undesired activi- 
ties (Table 2). 
We have recently applied DNA shuffling to members of
the human α-IFN gene family (85-97% pairwise amino
acid identity). Greater than 10^26 distinct recombinants can
be generated from the natural diversity in this gene family
(Figure 4a). While no foreseeable library technology will
allow an 'exhaustive' sampling of this sequence space,
DNA shuffling technology provides a design algorithm 
with which to selectively breed for IFNs with increased 
potency relative to naturally occurring IFNs. Typical 
chimeras produced by DNA shuffling are shown in 
Figure 4b. Generic high-throughput methods for α-IFN
expression and biological assay as fusion proteins on phage 
have been developed and used for rapid parallel analysis 
of recombinant IFNs. Phage-displayed recombinants with 
improved potency on human and murine cells have been 
obtained (Figure 4). 

This generic approach can be applied to improve many 
pharmaceutical proteins. Proteins with novel activities 
have been created by directed chimerization of modules 
from existing pharmaceutical proteins [17,18], a strategy 
that is likely to be particularly effective with cytokines,
which typically act by dimerizing two or more receptor
components. As with the rapid evolution of moxalactamase
activity by shuffling a cephalosporinase gene family [16••],
many new pharmaceutical activities may be discovered 
through breeding large libraries of chimeric pharmaceu- 
tical proteins. Selective breeding using DNA shuffling 
will allow rapid evolution of pharmaceutical proteins with 
potent activities from such recombinants, which initially 
have low levels of the desired activity. Backcrossing of 
these evolved variants with the wild type genes will allow 
one to remove functionally neutral changes, thus reducing 
the immunogenicity of the evolved proteins. 

Small molecule pharmaceuticals and 
industrial enzymes 

Microorganisms are widely used for the production of
pharmaceutical molecules, such as antibiotics, antifungals,
anticancers and immunosuppressives. In many cases, the
genes encoding the relevant biosynthetic enzymes are
known, often occurring in operons or gene clusters.
Rational engineering of these biosynthetic pathways to
improve yield or generate analogs is difficult because,
in addition to the difficulties of protein engineering, the
determination of rate limiting steps in the pathway is
laborious and uncertain. DNA shuffling is well suited
to the optimization of such pathways because the entire
pathway can be treated as the unit for evolution, with
no requirements for knowledge of the rate limiting steps
or for detailed structure/function analysis of the proteins.
Pathways for the detoxification of atrazine (J Minshull,
personal communication) and arsenate [19*] have been
improved using DNA shuffling (Table 1). Importantly,
in these examples no a priori knowledge was needed
to yield significant improvement. A benzylesterase used
industrially for deprotecting a precursor of the antibiotic
loracarbef has been improved using DNA shuffling ([20];
Table 1). Recombination was shown to be superior to
sequential error-prone point mutation for the evolution
of this activity [21*]. DNA shuffling has also recently
been used to modify the specificity of a tRNA charging
enzyme [22], with the ultimate goal of evolving tRNA
synthetases that can specifically charge tRNAs with
unnatural amino acids incorporated at specific sites.

DNA shuffling methodology. The first step of this method is to randomly fragment a population of related genes using DNase I. This produces
fragments of various lengths that, after denaturation, hybridize to form an equal mixture of 5' and 3' overhangs. Using PCR techniques, the
5' overhang fragments can be extended by Taq DNA polymerase, leaving the 3' overhang fragments unaffected. As a consequence of this
extension, the average fragment length increases during each cycle. Recombination occurs when a fragment derived from one template primes
a template with a different sequence [13]. Green dots represent beneficial mutations and red dots represent deleterious mutations. The coloured
bars indicate recombinations of portions of three parents into recombinant progeny.

The advantages of DNA shuffling as a sequence design algorithm for evolving complex new gene functions are shown schematically. The
coloured dots represent beneficial mutations. The vertical direction represents a generalized measure of fitness (i.e. kcat/Km for an enzyme)
with the fittest genes being at the top. Because the frequency of beneficial mutations is generally low relative to deleterious mutation, only single
beneficial mutations are generally added in each cycle of random mutagenesis and screening or selection. Hence, procedures that use iterative
point mutations must build up beneficial mutations one at a time through many rounds of selection, generally with only the best mutant from
any given cycle being pursued. In contrast, DNA shuffling allows one to directly recombine all beneficial mutations from any given round into
multi-step mutants with dramatically improved phenotypes.
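
The outcome of the recombination step described in the DNA shuffling methodology legend (a fragment from one template priming a template with a different sequence) can be mimicked in silico. The sketch below is only a toy model of the resulting chimeras, not of the fragmentation and PCR reassembly chemistry; the function name, the homopolymer "parents" and the rule of switching parent at every crossover are assumptions made for illustration.

```python
import random


def make_chimera(parents, n_crossovers, rng):
    """Assemble one chimera from aligned, equal-length parent sequences.

    The template parent is switched at each randomly placed crossover
    point. At least two parents are required.
    """
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("parents must be aligned to the same length")
    # Distinct interior positions at which the template changes.
    points = sorted(rng.sample(range(1, length), n_crossovers))
    template = rng.randrange(len(parents))
    pieces, start = [], 0
    for end in points + [length]:
        pieces.append(parents[template][start:end])
        # Continue on a different parent after the crossover.
        others = [i for i in range(len(parents)) if i != template]
        template = rng.choice(others)
        start = end
    return "".join(pieces)


# Hypothetical parents; homopolymers make the parental origin visible.
parents = ["AAAAAAAA", "CCCCCCCC", "GGGGGGGG"]
chimera = make_chimera(parents, n_crossovers=3, rng=random.Random(0))
switches = sum(chimera[i] != chimera[i - 1] for i in range(1, len(chimera)))
print(chimera, switches)
```

Seeding the generator makes a run reproducible; repeated calls with fresh random states would give a library of different chimeras.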

Evolved enzymes will find wide application in the 
replacement of multi-step chemical reactions required for 
manufacture of drugs or their precursors by an enzymatic 
conversion. Most naturally occurring enzymes capable of 
such valuable conversions require significant modification 
in activity, specificity, or expression level before they are 
suitable for large scale drug manufacture. DNA shuffling 
provides an important tool for the optimization of such 
enzymatic conversions. 



Evolved viruses for pharmaceutical 
applications 

The full length genomes of many viruses are in the range 
of 5-15 kilobases (kb), a size range that can be readily
handled by current DNA shuffling methods (Table 1). 
Our ability to clone and sequence the wealth of natural 
viral isolates far outstrips our molecular understanding and 
our ability to rationally manipulate them. Three wild-type 
strains of human papilloma virus have been successfully 
shuffled (D Apt, personal communication). The biological 
properties of this library of recombinants are currently 
being investigated. This approach has potential for the 
evolution of human papilloma virus to overcome the 
blocks to growth in transformed fibroblasts, and thus be 
able to grow in readily manipulated tissue culture systems 
for drug screening. 

Adenovirus is widely used as a gene therapy vehicle.
Over 100 naturally occurring serotypes with differing
tropisms are known. The fiber and penton genes of
adenovirus are the major determinants of tissue tropism,
as they are responsible for cell adhesion. The pentons
of adenoviruses interact with cellular integrins [23] and
the fibers with cellular receptors, one of which has been
newly identified [24]. Evolved adenoviral variants which
could be selectively targeted to particular cell types
would be of great utility for gene therapy and vaccine
delivery vectors. The penton and fiber genes of various
adenovirus serotypes have been shuffled (S Liu, personal
communication) and this approach may allow one to evolve
mutant adenoviral vectors with desired cell and tissue
tropism.

(a) The strategy and results from shuffling four homologous cephalosporinase genes are shown schematically. Single gene shuffling resulted
in eightfold increased resistance to the antibiotic moxolactam, whereas shuffling the gene family gave a 270-540-fold increase in resistance in
a single step [16]. (b) Evolution starting from a single gene is schematically contrasted with evolution based on shuffling a homologous gene
family. The axes denote a generalized sequence space. Shaded dots indicate particular sequences present in a given library. Hatched dots
represent sequences that are more fit than the best parental molecule. Greater distance indicates greater sequence divergence. Family shuffling
results in a relatively 'sparse' sampling of sequence space with relatively few individuals that are highly similar to the parental molecules and
many individuals that are very divergent from the parents.

Development of novel human therapeutics through molecular breeding technology.

Murine leukemia virus (MLV) is a retroviral vector that 
has received much attention as a gene therapy vehicle. As 
with adenovirus, there are >100 naturally occurring strains,
whereas only a single MLV strain is being developed for 
gene therapy applications (Moloney MLV). The envelopes 
of 15 MLV strains have been shuffled and viruses are being 
selected for improved tropism, titer, stability and gene 
expression properties (N Soong, personal communication). 

HIV-1 poses a major threat to human health which is
increasing because of the growing viral load worldwide,
the high rate of evolution of this pathogen, and the
concomitant evolution of associated opportunistic pathogens
in HIV-1 infected individuals. There is currently no
practical animal model in which to test the multitude of
antiviral drugs or vaccine strategies [25]. Work is beginning
on genetically engineered animal models to support
replication of HIV-1 (D Littman, M Goldsmith, personal
communication), but no replication has yet been observed
in these hosts. It is clear from viral phylogenetic trees
that lentiviruses can evolve the ability to grow in new
species and it is clear that recombination plays a major
role in the natural high rate of evolution of lentiviruses
[20]. DNA shuffling provides a powerful new tool with
which to accelerate the adaptation of viruses to grow in
laboratory animal hosts. Recombination is believed to
be of great importance for the naturally high rate of
evolution of retroviruses [26]. Laboratory animals have
been engineered with the human HIV-1 receptor and
co-receptor genes (D Littman, personal communication).
DNA shuffling is being used to recombine entire genomes
and individual genes of natural HIV-1 isolates to accelerate
the adaptation of HIV-1 to grow in these engineered
animals (P Patten and N Landau, unpublished data). We
anticipate that the adaptation of HIV-1 to replicate in
a laboratory animal will open up many fertile avenues
for drug and vaccine discovery on this important human
pathogen. Shuffling of natural diversity to create large
libraries of chimeric viruses is a general approach that
can be applied to other viruses, such as the hepatitis B
virus and hepatitis C virus, for the purpose of producing
domesticated forms of these viruses (i.e., easier to handle
in the laboratory) for vaccine development and drug
screening on variants that can readily be grown in tissue
culture systems.

Figure 4

(a) Human α-interferon diversity. Eleven natural human α-IFN sequences are shown [29]. Consensus α-IFN is given at the top.
Dashes indicate identity to consensus. The number of distinct recombinants that can be generated by shuffling [...].
This number is calculated by multiplying the number of different amino acids observed at each polymorphic [...]. (b)
Sixteen representative recombinant α-IFNs derived by shuffling eight natural human α-IFN genes are shown schematically ([...]
data). The high crossover and low crossover libraries were generated by shuffling 20-50 bp or [...] bp fragments [...]
shuffling was done essentially as described in [16]. The shuffled IFNs were expressed as fusions to gene III on [...]
screened for antiproliferative activity using Daudi cells. Phage displayed αIFN-MAX4 (high crossover library, fourth from the top) is 40-fold more
active in a Daudi antiproliferation assay than IFN2a and twofold more active than consensus 1 IFNs displayed on phage.

Vaccines 

Development of effective vaccine technologies has stim- 
ulated renewed government emphasis and interest from 
pharmaceutical researchers. DNA vaccines are particularly 
attractive because of their relatively low cost and the 
feasibility of rapid generation of diverse variant vaccines
containing evolved promoters, immunostimulatory
sequences, cytokines, etc., for comparison testing [27].
For example, assays for selection of DNA vaccines
with improved promoter activity, immunostimulatory
sequences, enhanced expression levels or cell tropism can
be developed to produce second generation DNA vaccine
plasmids. Viral vaccine vectors can be enhanced by DNA
shuffling to give desired properties of tropism, stability
and expression level. The promise of this new technology
notwithstanding, existing human vaccines rely on live 
or killed whole organisms or components purified from 
whole organisms. We envision opportunities for DNA 
shuffling as a tool for increasing the efficiency and success 
rate of the development of novel whole organism, viral, 
bacterial and recombinant protein vaccines. Pathogenic 
viruses can be subjected to DNA shuffling followed by 
selection for desired attenuation properties while retaining 
the immunogenicity required for a vaccine. Viruses that
would serve as excellent vaccine vectors or as vaccines 
but which cannot be manufactured to sufficient titer in 
manufacturing cell lines, can be shuffled and selected for 
improved titre to create new commercial opportunities. 
Recombinant proteins that are known to be excellent 
vaccine immunogens but which cannot be manufactured 
in appropriate yield or in suitable host systems can 
be shuffled and screened to solve such expression 
problems, while co-selecting for retention of necessary 
epitopes. These and other valuable opportunities for 
application of DNA shuffling technologies in vaccinology 
are summarized in Table 2. 

The impact of genomics 

The rapid rate of increase in the availability of known gene 
sequences and our ability to manipulate them in cloned 
form greatly exceeds our understanding of these genes and 
our ability to engineer these sequences based on rational 
models. Sequence information from informatics databases 
can readily be converted into functional DNA clones given 
tools such as PCR, synthetic DNA and methods for the
rapid assembly of genes from synthetic DNA [28]. 

Shuffling of natural diversity to explore the sequence 
space defined by shuffled homologues is a demonstrably 
powerful strategy for accelerated evolution of biological 
molecules with novel activities ([16]; Figure 3). We 
expect the dramatic success seen with family shuffling
of cephalosporinase genes to be repeated in many other
systems in which homologous genes are recombined. The 
explosion of DNA sequences available on the internet 
provides a rich, diverse and rapidly expanding supply of 
molecular breeding stock. DNA shuffling is a general and 
natural algorithm for functionally exploiting this natural 
diversity, with minimal requirements for a priori genetic 
or biochemical characterization to guide this exploitation. 

Conclusions and future directions 

In order to unlock desired biologically active sequences 
from the potential diversity present in an organism's
gene pool, it is of great importance to understand which
evolution algorithms are the most effective. Point mutation
techniques, such as error-prone PCR and repeated
oligonucleotide directed mutagenesis, search sequence 
space by creating libraries of randomly mutated molecules. 
In contrast, DNA shuffling exchanges large functional 
blocks of sequence containing previously selected mu- 
tations to search for the best candidate molecules, 
thus mimicking and accelerating the process of sexual 
recombination. We expect that, just as recombination 
has played a major role in the evolution of life, DNA 
shuffling will play a central role in the development of 
applied molecular evolution technologies and will prove 
indispensable for bringing existing biological diversity into 
the service of human health care, and agricultural and 
industrial chemical needs.

Codon usages in 22361 genes can be analyzed using the nucleotide
sequence data obtained from the GenBank Genetic Sequence Data
Bank (Release 69.0, Sept., 1991). The database is called the
CUTG Database (1-4), and is distributed on EMBL CD-ROM
(December 1991; CODON by Wada et al.) as a member of NAR
Sequence Supplement Databases (5). The CUTG codon database
is also available for on-line access at DDBJ (DNA DATA BANK
OF JAPAN): please contact DDBJ (mail address:
ddbj@ddbj.nig.ac.jp).

Files named ***.CODON.69 list the codon use in each
gene registered in the GenBank Sequence files (gb***.seq). The
LOCUS names given in the GenBank were used for designating
individual genes, and the SHORT DIRECTORY of the GenBank
Short Directory File is presented for defining each LOCUS name
analyzed here (see ***.SDR.69).

To reveal the characteristics of the codon use of a wide range
of organisms, as well as viruses and organelles, the frequency
(per one thousand) of codon use in each organism for which more
than 20 genes are available was calculated by summing up
numbers of codon use (***.total.69); see Table 1 of this paper. The
number of genes summed for each organism is given in the row
designated as GENES, and the total codon number thus summed
is given at the bottom row. Names of the organisms are listed
in the SPECIES FILE (Table 2). Amino acids are added simply
according to the universal codon table.
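
The per-thousand frequencies described above can be illustrated with a short sketch. It is not the CUTG software: the function names and the two toy coding sequences are hypothetical, and the input is assumed to be complete, in-frame coding sequences already extracted from the database.

```python
from collections import Counter


def codon_counts(cds):
    """Count the codons of one in-frame coding sequence (DNA or RNA)."""
    cds = cds.upper().replace("T", "U")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return Counter(cds[i:i + 3] for i in range(0, len(cds), 3))


def usage_per_thousand(genes):
    """Sum codon counts over all genes, then scale to frequency per 1000."""
    total = Counter()
    for cds in genes:
        total.update(codon_counts(cds))
    n_codons = sum(total.values())
    freqs = {codon: 1000.0 * n / n_codons for codon, n in total.items()}
    return freqs, n_codons


# Hypothetical toy genes; a real tabulation would need more than 20 genes.
toy_genes = ["ATGAAATAA", "ATGAAAAAATAG"]
freqs, n_codons = usage_per_thousand(toy_genes)
print(n_codons, round(freqs["AAA"], 1))
```

Summing the raw counts before scaling, as here, weights each gene by its length; averaging per-gene frequencies would give a different table.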

METHODS 

In selecting protein coding sequences we relied on the 
FEATURES tables of the GenBank, and only complete genes, 
starting with an initiation codon and ending with one of stop 
codons, were used in the analysis (see REFERENCES for
details). In the GenBank, a group of consecutive genes whose
entire region had been sequenced were registered under one 
LOCUS name. To distinguish the different genes belonging to 
a single LOCUS, symbol # followed by a number is added after 
the LOCUS name; the numbers represent the order of the peptides 
registered in the FEATURES of the GenBank. When introns of 
a gene have not been completely sequenced, some of its exons 
are registered in separate entries (LOCUS) in the GenBank. These 
exons belonging to the same gene but having different LOCUS 
names were combined, and the LOCUS name of the last exon 
followed by symbol * was given to the gene thus combined. The 
order of the codons in the table is the same as the previous 
compilation (see the CODON_LABEL file or REFERENCES).

Table 1 (excerpt). Codon usage, as frequency per thousand codons,
for five organelle gene sets. Columns: RIC CP, SPI CP, TOB CP,
WHT CP and YSC MT. Rows: No.GENES, then the codons grouped by
amino acid (ARG, LEU, SER, THR, PRO, ALA, GLY, VAL, LYS, ASN,
GLN, HIS, GLU, ASP, TYR, CYS, PHE, ILE, MET, TRP, TER), then
TOTAL. No.GENES: 118, 50, 69 and 28 for the first four columns.
TOTAL codons: 23863, 17471, 19104, 5203 and 14356.
[The per-codon frequencies are not legible in this copy.]

Table 2.

***** PRI (Primate genes)
CHP Chimpanzee
HUM Human

***** ROD (Rodent genes)
CRU Chinese hamster
GPI Guinea pig
HAM Hamster
MUS Mouse
RAT Rat

***** MAM (Mammalian genes other than those in PRI and ROD files)
BOV Bovine
DOG Dog
PIG Pig
RAB Rabbit
SHP Sheep

***** VRT (Genes of other vertebrates)
CHK Chicken
DUK Duck
ONH Salmon
SMO Trout
XEL Xenopus laevis

***** INV (Invertebrate genes)
APL Aplysia
BMO Bombyx mori
CEL Caenorhabditis elegans
CHI Chironomus
DDI Dictyostelium discoideum
DRO Drosophila
MOT Manduca sexta
PFA Plasmodium
SCM Schistosoma
SUP Sea urchin (P.miliaris)
SUS Sea urchin (S.purpuratus)
TRB Trypanosoma brucei

***** PLN (Plant genes)
ATH Arabidopsis
BLY Barley
BNA Brassica napus
CRE Chlamydomonas
EME Aspergillus nidulans
MZE Maize
NEU Neurospora crassa
PEA Pea
PHV Bean
POT Potato
RIC Rice
SLM Physarum
SOY Soybean
SPI Spinach
TOB Tobacco
TOM Tomato
WHT Wheat
YSA Yeast (Candida)
YSC Yeast (S.cerevisiae)
YSK Yeast (K.lactis)
YSP Yeast (S.pombe)

***** BCT (Bacterial genes)
ACC Acinetobacter
AFA Alcaligenes
ANA Anabaena
ATU Agrobacterium
AVI Azotobacter vinelandii
BAC Bacillus
BPE Bordetella
CKT Chlamydia
CLO Clostridium
COR Corynebacterium
DVU Desulfovibrio
ECO Escherichia coli
ERW Erwinia
FDI Cyanobacterium (F.diplosiphon)
FPL F plasmid (from E.coli)
HAL Halobacterium
HEI Haemophilus influenzae
INS Insertion element
KPN Klebsiella
LAC Lactococcus lactis
MBI Methanobacterium thermoautotrophicum
MSG Mycobacterium
MVA Methanococcus vannielii
NGO Neisseria
PRM Proteus
PSE Pseudomonas
R10 Plasmid R100
RCA Rhodopseudomonas capsulata
RHB Bradyrhizobium japonicum
RHM Rhizobium
RSP Rhodospirillum rubrum
RSS Rhodobacter sphaeroides
SHF Shigella flexneri
SMA Serratia marcescens
SSP Sulfolobus SSV1 viruslike particle
STA Staphylococcus
STM Streptomyces
STR Streptococcus
STY Salmonella typhimurium
SYC Synechocystis
SYO Synechococcus
TFE Thiobacillus
TIP Agrobacterium Ti plasmid
TRN Transposon
TTH Thermus
VIB Vibrio
YEP Yersinia

***** ORG (Organelle genes)
BOV MT Bovine mitochondrion
CPA CY C.paradoxa cyanelle
CRE CP Chlamydomonas chloroplast
DRO MT Drosophila mitochondrion
EGR CP Euglena chloroplast
MPO CP Marchantia polymorpha chloroplast
MUS MT Mouse mitochondrion
MZE CP Maize chloroplast
MZE MT Maize mitochondrion
PAR MT Paramecium mitochondrion
PEA CP Pea chloroplast
RAT MT Rat mitochondrion
RIC CP Rice chloroplast
SPI CP Spinach chloroplast
TOB CP Tobacco chloroplast
WHT CP Wheat chloroplast
YSC MT Saccharomyces cerevisiae mitochondrion

***** VRL (Viral genes)
ADR Adenovirus
ASV African swine fever virus
BTV Bluetongue virus
FLA Influenza virus A
FLB Influenza B
HIV Human immunodeficiency virus
HPB Hepatitis B virus
HS1 Herpes simplex virus type 1
HS2 Herpes simplex virus type 2
HS4 Epstein-Barr virus
HS5 Cytomegalovirus
HS6 Human herpesvirus type 6
HSE Equine herpesvirus
HSV Herpesvirus saimiri
MCV Cucumber mosaic virus
MEA Measles virus
MHV Murine hepatitis virus
NDV Newcastle disease virus
NPA Autographa californica nuclear polyhedrosis virus
PAF Parainfluenza virus
PIP Human parainfluenza virus
PLY Polyomavirus
PPH Human papillomavirus
REO Reovirus
RSH Respiratory syncytial virus
SIV Simian immunodeficiency virus
SND Sendai virus
VAC Vaccinia virus
VAZ Varicella-Zoster virus
VSV Vesicular stomatitis virus
WHV Woodchuck hepatitis virus

***** PHG (Phage genes)
F1C Bacteriophage f1
LAM Bacteriophage lambda
P22 Bacteriophage P22
PMU Bacteriophage Mu
PP1 Bacteriophage P1
PRD Bacteriophage PRD1
PT3 Bacteriophage T3
PT4 Bacteriophage T4
PT7 Bacteriophage T7
PZA Bacteriophage PZA (from B.subtilis)

Fast and sensitive multiple sequence
alignments on a microcomputer

Desmond G.Higgins* and Paul M.Sharp 



Abstract 

A strategy is described for the rapid alignment of many long
nucleic acid or protein sequences on a microcomputer. The
program described can handle up to 100 sequences of 1200
residues each. The approach is based on progressively aligning
sequences according to the branching order in an initial
phylogenetic tree. The results obtained using the package appear
to be as sensitive as those from any other available method.

Introduction 

In the recent literature on biological sequence analysis, at least
a dozen methods for performing multiple alignments of nucleic
acid or protein sequences have been described [e.g. Bains
(1986), Sobel and Martinez (1986), Barton and Sternberg
(1987), Feng and Doolittle (1987), Santibanez and Rohde
(1987), Taylor (1987)]. The motivation for this effort has been
the need for the automatic alignment of three or more sequences
for the purposes of evolutionary or structural comparisons or for
attempting to demonstrate similarity between sets of sequences.
In this paper, we describe a strategy which we believe offers
the best combination of speed and sensitivity available for any
multiple alignment method. We offer a program which can
perform multiple alignments of up to 100 sequences of
maximum length 1200 residues on a microcomputer in a
reasonable amount of time. We judge the program to be
'sensitive' because the results obtained are very difficult to
improve by eye.

The strategy we use is essentially that of Feng and Doolittle
(1987) adapted for use on microcomputers. The general
approach is to progressively align groups of sequences according
to the branching order in a hypothetical phylogenetic tree,
with gaps that occur in earlier alignments being preserved
through later stages. At each alignment stage, a two-sequence
alignment algorithm, such as the dynamic programming method
of Needleman and Wunsch (1970), is used. For two sequences,
the Needleman and Wunsch algorithm gives an alignment that
is guaranteed to be optimal for a given set of scoring rules
(i.e. weights for aligned residues and penalties for gaps). When
this method is used to align two sets of sequences, the score
at each position in the alignment is taken from the average score
for each residue in one set compared against each residue in
the second set. Any gaps introduced into either set of sequences
are scored as single gaps. The main difficulty in using this
approach on a microcomputer arises from the excessive memory
requirements of the Needleman and Wunsch (1970) method:
memory usage is proportional to the square of the average
sequence length.
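
The averaging rule for aligning two sets of sequences can be sketched as follows. This is an illustration only, not the program's FORTRAN code: the identity score, the treatment of gap characters as scoring zero, and the two small pre-aligned groups are all assumptions made for the example.

```python
from itertools import product


def identity_score(a, b):
    """Toy residue score: 1 for identical residues, 0 otherwise.

    A gap character ('-') scores 0 against everything (an assumption).
    """
    return 1 if a == b and a != "-" else 0


def column_score(column_a, column_b, score=identity_score):
    """Average score of every residue in one column against every residue
    in the other column."""
    pairs = list(product(column_a, column_b))
    return sum(score(a, b) for a, b in pairs) / len(pairs)


def column(group, position):
    """Residues found at one alignment position of a pre-aligned group."""
    return [seq[position] for seq in group]


# Hypothetical pre-aligned groups of equal length.
group_a = ["ACGT", "ACGA"]
group_b = ["ACTT", "A-TT"]
scores = [column_score(column(group_a, i), column(group_b, i))
          for i in range(4)]
print(scores)
```

In a full group-to-group alignment these column scores would take the place of the single residue-against-residue weights in the dynamic programming recursion.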

In a previous paper (Higgins and Sharp, 1988) we described
a strategy for the very rapid multiple alignment of large numbers
of sequences on a microcomputer. This method also comprised
a progressive approach, using the fast, but approximate,
two-sequence alignment method of Wilbur and Lipman (1983).
While this approach is extremely rapid and economical with
core memory, it works well only for closely related sequences.
We did not consider using the exactly optimal method of
Needleman and Wunsch (1970) for the progressive alignments
because of the excessive memory requirements. However, a
recent paper by Myers and Miller (1988) demonstrates how
to achieve exactly optimal alignments of two sequences where
memory usage varies only linearly with sequence length,
without making use of bit packing or secondary disk storage.
Thus, a progressive series of alignments of larger and larger
groups of sequences, using the method of Myers and Miller
(1988) for each alignment, is the key to the current approach.

System 

The program described in this paper was written in standard
FORTRAN 77 and compiled using the Microsoft FORTRAN
compiler, version 4.0. Program performance was tested on an
IBM AT compatible microcomputer, running at 10 MHz with
no maths coprocessor, 640 kbytes of memory and a hard disk.
This program (CLUSTAL4) is an extension to the package
described in Higgins and Sharp (1988). Copies of the executable
files, documentation and test data files will be sent on request.
Please send three 5.25 inch floppies formatted to 360 kbytes,
or one high density 5.25 inch floppy formatted to 1.2 Mbytes.



Algorithms 

The program takes, as input, a dendrogram produced by
applying the UPGMA method (Sneath and Sokal, 1973) to a
matrix of similarity scores between all pairs of sequences to be
aligned. The similarity scores are calculated as the number of
exactly matched residues in a Wilbur and Lipman (1983)
alignment between two sequences, minus a fixed penalty for every
gap. For short sequences, several similarity scores per second
can be calculated on a microcomputer using the package
described in Higgins and Sharp (1988).
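
As an illustration of how a dendrogram can be derived from such a similarity matrix, the sketch below applies UPGMA clustering directly to similarities: the most similar pair of clusters is joined, and the similarity of the merged cluster to every other cluster is the size-weighted average. The data layout, the function name and the scores are hypothetical and are not taken from the package.

```python
from itertools import combinations


def upgma(similarity):
    """Build a nested-tuple dendrogram from pairwise similarity scores.

    `similarity` maps (name_a, name_b) pairs to a score; each unordered
    pair appears once.
    """
    sim = {frozenset(pair): s for pair, s in similarity.items()}
    # cluster -> number of sequences it contains
    clusters = {name: 1 for pair in similarity for name in pair}
    while len(clusters) > 1:
        # Join the two most similar clusters.
        a, b = max(combinations(clusters, 2),
                   key=lambda p: sim[frozenset(p)])
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        merged = (a, b)
        for c in clusters:
            # Size-weighted (unweighted pair-group) average similarity.
            sim[frozenset((merged, c))] = (
                size_a * sim[frozenset((a, c))]
                + size_b * sim[frozenset((b, c))]
            ) / (size_a + size_b)
        clusters[merged] = size_a + size_b
    return next(iter(clusters))


# Hypothetical similarity scores between four sequences.
toy_scores = {
    ("A", "B"): 90, ("A", "C"): 40, ("A", "D"): 20,
    ("B", "C"): 50, ("B", "D"): 30, ("C", "D"): 60,
}
tree = upgma(toy_scores)
print(tree)
```

The nesting of the returned tuple gives the branching order: the innermost pairs are aligned first and the outermost join is the last alignment.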

The sequences are then aligned in groups corresponding to
the branching order in the dendrogram. The alignments are
carried out using the method of Myers and Miller (1988),
adapted for use in a multiple alignment context. Myers and
Miller took the distance minimizing algorithm of Gotoh (1982)
and applied a 'divide and conquer' strategy (attributed to
Hirschberg, 1975) to give alignments in linear space; as a
consequence memory usage is linearly related to sequence
length. Briefly, the method is based on finding the optimal
mid-point of an alignment. When this is found, the matched
symbols (two aligned residues, or one residue opposite a gap)
are part of the final alignment. The rest of the alignment is found
by recursively finding optimal mid-points on either side of the
initial mid-point. In this context, the optimal mid-point can be
defined as the aligned symbols at the centre of the optimal
alignment. The centre is taken to be half way along one
sequence.
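
The mid-point idea can be sketched in a few lines. This is a simplified illustration rather than the Myers and Miller algorithm itself: it uses a single cost per gap symbol instead of Gotoh's separate gap-opening and gap-extension penalties, and the function names and the unit costs are assumptions. Only one row of the dynamic programming matrix is kept at a time, so memory grows linearly with sequence length.

```python
def last_row(a, b, dist, gap):
    """Distances for aligning all of `a` against every prefix of `b`,
    keeping only one previous row in memory."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            cur.append(min(prev[j - 1] + dist(a[i - 1], b[j - 1]),
                           prev[j] + gap,      # symbol of a opposite a gap
                           cur[j - 1] + gap))  # symbol of b opposite a gap
        prev = cur
    return prev


def optimal_midpoint(a, b, dist, gap):
    """Split `a` half way along and find where `b` must be split so that
    the two halves can be aligned independently at minimum total distance."""
    mid = len(a) // 2
    forward = last_row(a[:mid], b, dist, gap)
    backward = last_row(a[mid:][::-1], b[::-1], dist, gap)
    totals = [forward[j] + backward[len(b) - j] for j in range(len(b) + 1)]
    j = min(range(len(b) + 1), key=totals.__getitem__)
    return mid, j, totals[j]


def mismatch(x, y):
    """Toy distance: 0 for identical symbols, 1 otherwise."""
    return 0 if x == y else 1


mid, split, distance = optimal_midpoint("kitten", "sitting", mismatch, gap=1)
print(mid, split, distance)
```

Recursing on the two halves defined by the split would recover the full alignment; the distance found at the mid-point already equals the optimal distance for the whole pair.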

Two modifications were needed to adapt the Myers and Miller 
algorithm for our program. Firstly, all real number operations 
w:ere converted to using 2-byte integers. On a 16-bit micro- 



computer without a maths chip, this increases the speed of each 
alignment by a factor of 30. Indeed the speed approaches tiiat 
described in Myers and Miller for their program running on 
a VAX 1 1/780. Secondly, the scoring system was modified to 
allow all residues at a given position in each group of sequences 
to contribute to the alignment scores. For proteins, we use the 
log-odds amino acid similarity matrix of Dayhoff (1978) to score 
aligned residues. The similarity matrix was rescaled to give 
positive integer weights between 0 and 25 and then converted 
to a difference matrix by subtracting each value from 25. Thus 
two aligned tryptophans have the lowest distance (0) while a 
cysteine aligned with a tryptophan has the largest distance (25). 
For nucleic acid sequences we use a three-tier weighting system 
where identical residues have zero distance, transitions have 
a distance of 5 and transversions have a distance of 10. A
variable gap penalty is used; a fixed penalty is added to the 
alignment distance for every gap and an extra penalty is added 
for every item in the gap. Gaps that are introduced into a
pre-aligned group of sequences are scored as single gaps. Both of
these penalties can be specified at run time.
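
The nucleic acid weighting and the variable gap penalty described above can be sketched as follows. This is only an illustration under stated assumptions: the function names and the default penalty values (`fixed=10`, `per_item=1`) are hypothetical, since the text says only that both penalties are chosen at run time, and the code scores an alignment that already exists rather than finding one.

```python
# Three-tier nucleotide distance from the text:
# identical residues = 0, transitions = 5, transversions = 10.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def nucleotide_distance(a: str, b: str) -> int:
    """Distance between two DNA bases (A, C, G, T)."""
    a, b = a.upper(), b.upper()
    if a == b:
        return 0
    pair = {a, b}
    # A transition stays within purines or within pyrimidines.
    is_transition = pair <= PURINES or pair <= PYRIMIDINES
    return 5 if is_transition else 10


def alignment_distance(row1: str, row2: str, fixed: int = 10, per_item: int = 1) -> int:
    """Distance of two already-aligned rows; '-' marks a gap position.

    Each gap costs `fixed` once, plus `per_item` for every item in it.
    The default values are placeholders, not values from the paper.
    """
    if len(row1) != len(row2):
        raise ValueError("aligned rows must have equal length")
    total = 0
    prev_gap_row = None  # which row held a gap in the previous column
    for x, y in zip(row1.upper(), row2.upper()):
        gap_row = 1 if x == "-" else 2 if y == "-" else None
        if gap_row is None:
            total += nucleotide_distance(x, y)
        else:
            if gap_row != prev_gap_row:
                total += fixed  # a new gap opens here
            total += per_item  # every gapped item adds the extra penalty
        prev_gap_row = gap_row
    return total
```

With the placeholder penalties, `alignment_distance("ACGT", "A-GT")` charges one fixed penalty and one per-item penalty for the single gap and nothing for the three identical columns.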

Fig. 1. Times required for the multiple alignment of different numbers of
sequences of different lengths. Each curve represents the times for truncated
fragments of a given length, L; this example used the HIV pol protein. Times
for similarity matrices or dendrograms (maximum of <3 min)
are not included.

Fig. 2. Times for aligning different numbers of sequences. This example used
globin sequences (alphaglobins, betaglobins and myoglobins) each truncated
to 140 residues. The four curves show times for different parts of the multiple
alignment process: (a) total time, including calculation of similarity matrix and
dendrogram, (b) calculation of similarity matrix, (c) multiple alignment,
(d) construction of UPGMA dendrogram.

In order to calculate the alignment scores between two clusters
of sequences, the gaps that are already inserted in the two
clusters (from earlier alignment stages) are treated as being
fixed. Thus, each cluster may be thought of as a single sequence,
where the residues at each position are the 'average' residues
at that position in each of the sequences represented by the
cluster. In order to calculate the weight between a position in
one cluster and another position in the second cluster, one takes
the arithmetic average of all the pairwise weights between each
residue in one cluster versus all of those in the second. For
two clusters with K and L sequences each, this involves taking
the average of K times L weights at each alignment position.
This becomes very time-consuming with many sequences, but
can be speeded up by pre-calculating the weight of each
position in each cluster versus each possible residue.
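
The averaging step and the pre-calculation shortcut can be sketched as below. The helper names are hypothetical, the weight function is passed in by the caller, and the clusters are assumed to be equal-length, gap-free rows for simplicity (the text does not say how gap positions are weighted).

```python
def pairwise_average(col_a, col_b, weight):
    """Direct K x L average of all residue-pair weights between two columns."""
    total = sum(weight(a, b) for a in col_a for b in col_b)
    return total / (len(col_a) * len(col_b))


def precompute_profile(cluster, alphabet, weight):
    """For every position of a cluster, the average weight of that column
    versus each possible residue in `alphabet` (computed once per cluster)."""
    length = len(cluster[0])
    profile = []
    for pos in range(length):
        col = [seq[pos] for seq in cluster]
        profile.append(
            {r: sum(weight(c, r) for c in col) / len(col) for r in alphabet}
        )
    return profile


def profile_average(profile, pos, col_b):
    """Same average as pairwise_average, using L table lookups instead of
    K x L weight evaluations."""
    return sum(profile[pos][b] for b in col_b) / len(col_b)
```

Because the mean over K x L pairs equals the mean, over the L residues of one cluster, of the per-residue means over the other cluster, both routes give the same value; the table only moves the inner loop out of the alignment recursion.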

Results and discussion 

The speed of the program can be demonstrated by aligning
different numbers of sequences of different sizes. We find that 
the speed is almost totally independent of the characteristics 
of the sequences, apart from length. This was also noted by 
Myers and Miller (1988) for their two-sequence alignment 
program. Figure 1 shows the times required for a series of
multiple alignments of sequences from 200 to 1000 amino acids 
in length. One expects the alignment times to vary with the 
square of the sequence lengths and a visual inspection of the 
figure confirms this. The time required to align different 
numbers of sequences varies approximately linearly. A slight 
departure from linearity is evident with the longer sequences. 
This confirms the effectiveness of our strategy of pre-calculating
the weights at different positions in each cluster. The times 
required for calculating the initial similarity matrices and 
construction of the dendrograms are not shown. These only need 
to be calculated once for any multiple alignment. The slowest 
dendrogram to construct was that for the ten 1000 residue 
sequences. This took under 3 min. 

Figure 2 shows the times required to align from 10 to 90 
sequences of 140 amino acids each. In this case the times for 
each of the various calculations are shown. The similarity
matrices and dendrograms were constructed using the programs
CLUSTAL1 and CLUSTAL2 (Higgins and Sharp, 1988)
respectively. For large numbers of sequences, the calculation
of the initial similarity matrix is the dominant time-consuming
factor. For 90 sequences, this requires the calculation of 4005
values. Nonetheless, the times involved are quite practical on
a microcomputer.

The sensitivity of the program is more difficult to demon- 
strate. Our basic criterion in determining sensitivity is to assess 
the ease with which the resulting alignments can be improved 
by manual adjustment. By this criterion, we find the results of 
our program to be excellent. In this respect, the program can 
be used confidently to replace the usual manual alignment of 
sets of closely related sequences for publication. Of greater 
scientific importance is the usefulness of the program for 
aligning regions of homologous secondary structure or in 
reconstructing evolutionary events between distantly related
sequences. This is more difficult to demonstrate. Barton and
Sternberg (1987), Feng and Doolittle (1987) and Taylor (1987)
discuss these questions in detail. It is possible that no single
method will be ideal for these purposes. As a general
observation, we find the alignments produced by our program to be
at least as good as those produced by each of the above authors.
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Phylogenetic Trees 
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Summary. A progressive alignment method is de- 
scribed that utilizes the Needleman and Wunsch 
pairwise alignment algorithm iteratively to achieve 
the multiple alignment of a set of protein sequences 
and to construct an evolutionary tree depicting their 
relationship. The sequences are assumed a priori to 
share a common ancestor, and the trees are con- 
structed from difference matrices derived directly 
from the multiple alignment. The thrust of the 
method involves putting more trust in the compar- 
ison of recently diverged sequences than in those 
evolved in the distant past. In particular, this rule 
is followed: "once a gap, always a gap.'' The method 
has been applied to three sets of protein sequences: 
7 superoxide dismutases, 11 globins, and 9 tyrosine
kinase-like sequences. Multiple alignments and
phylogenetic trees for these sets of sequences were
determined and compared with trees derived by
conventional pairwise treatments. In several instances,
the progressive method led to trees that appeared
to be more in line with biological expectations than
were trees obtained by more commonly used
methods.

Key words: Multiple sequence alignments - Evolutionary trees



Introduction 

The evolutionary relationships of sets of protein (or
nucleic acid) sequences are commonly depicted in
the form of trees (Fitch and Margoliash 1967; Dayhoff
et al. 1972; Moore et al. 1973; Sankoff et al.
1982; inter alia). Indeed, the digital nature of
sequence data makes them more amenable to such
treatment than is the case with many more
qualitative biological characters. Most current schemes
for constructing trees from sequences use a simple
difference matrix, the elements of which are
assembled by performing pairwise comparisons of all the
sequences under study (Fitch and Margoliash 1967).
A topology is found by classifying the sequences
according to their differences, which ought to be a
reflection of the evolutionary distances among them.
For the most part, the principle of parsimony is
rigorously adhered to, and the best trees are thought
to be those that can account for the extant sequences
by the smallest number of genetic events. The two
important features of a tree are its topology, or
branching order, and its branch lengths, which ought
to be proportional to the true evolutionary
distances.

In principle, the construction of an evolutionary 
tree based on sequence data ought to be a simple 
matter: all one has to do is cluster the sequences 
according to their similarities. In practice, uncer- 
tainties and ambiguities concerning both the topol- 
ogy and branch lengths are common, and enormous 
effort is often expended in finding the "best tree" 
(e.g., Fitch 1977; Penny and Hendy 1986). Finding
the correct tree should depend on assembling a
matrix that best describes the differences among the
sequences, and this depends, in turn, on properly 
aligning the sequences (Hogeweg and Hesper 1984). 
The alignments can be obtained either by schemes 
that maximize similarity (Needleman and Wunsch
1970) or with those that minimize differences
(Sellers 1974). If a similarity scheme is used, the scores
must be transformed appropriately into measures 
of distance. 
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Ordinarily, alignments of either type are per- 
formed pairwise. The problem is that when the var- 
ious paired alignments are grouped, they are seldom 
consistent one to another. Thus, when sequence A 
is paired with sequence B, gaps may appear at var- 
ious locations, but when either A or B is aligned 
with a third sequence, C, the arrangement of gaps 
may be entirely different. Heretofore, this problem 
has been circumvented by making a multiple align- 
ment of all the sequences by the judicious shifting 
of the sequences as needed to minimize differences 
("eyeball" alignment). 

The flaw in the approach is that these multiple 
alignments have, like pairwise alignment schemes 
before them, been subject to rigorous attempts at 
parsimony. Obviously, the closer two sequences re- 
semble each other, the more confidence one has in 
the alignment. But in most multiple alignment 
schemes where maximum parsimony is sought, no 
distinction is made with regard to the confidence 
one has in a particular pairwise alignment. It seems 
to us folly that a gap should be discarded in an 
alignment of two closely related sequences merely 
because an alignment with some distantly related 
sequence might be improved. 

To this end, we have devised a scheme of pro- 
gressive sequence alignment that has a higher in- 
trinsic regard for recent events than for distant ones. 
It is still based on a maximization of similarities, 
but it follows the simple rule "once a gap, always a 
gap." It is able to accomplish this by inserting neu- 
tral elements into sequences once gaps have been 
established. The sequences are aligned progressive- 
ly, beginning with the most similar pair and con- 
tinuing with the addition of the next most similar 
sequence or set of sequences. The difference scores 
obtained from the final alignment of all sequences 
are then used to construct the evolutionary tree. 
Ambiguities may still arise, of course, since the pre- 
liminary matrix of similarities (or differences) based 
on pairwise comparisons will often include what we 
call "better but less reliable" scores. These can be 
sorted out by testing alternative trees. Because it is 
impractical to consider all possible pairwise orders, 
we have adopted an effective compromise whereby
reasonable alternative arrangements are explored 
progressively. 

In this paper we describe the details of the method 
and apply it to several groups of protein sequences. 
Trees constructed by this approach can differ sig- 
nificantly from those assembled by traditional 
schemes, but they are often in accord with what 
might be expected on the basis of organismic phy- 
logenies. The method has the added virtue of pro- 
viding multiple sequence alignments quickly and 
simply by completely objective criteria. 



Methods 

Studies were performed on a DEC 11/730 VAX computer with
the UNIX (Berkeley 4.3) operating system. The plotting package
for use with a Nicolet Zeta plotter was written by Steve Dempsey
of the U.C.S.D. Chemistry Department Computer Center. All
utility programs were written in the C programming language
(Kernighan and Ritchie 1978). The ensemble of programs dealing
with sequence alignment and tree building can be obtained by
sending a blank magnetic tape to the authors.

Definitions. For purposes of description only, we would like
to distinguish between simple and compound trees. Simple trees
are those in which the branching order follows the simple
clustering (((AB)C)D) etc., whereas compound trees have subclusters,
as in ((AB)(CD)E). Neutral elements are simply characters (Xs)
that are filled into sequences when gaps occur. They are neutral
in the sense that they are invisible to the scoring system used to
establish subsequent alignments, which is to say when X is matched
with any other residue, the value is equal to zero. Negative
segments are those internodal connecting distances with negative
values that occasionally emerge from Fitch-Margoliash trees when
data scatter confounds the segment-averaging (or least-squares)
treatment. Percent identity is taken as the number of identities
per 100 aligned residues.
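
As a small illustration of the last definition, percent identity over two aligned rows might be computed as follows. The function name is hypothetical, and the choice to skip columns holding a neutral element (X) or a gap character when counting "aligned residues" is an assumption, not something the text specifies.

```python
def percent_identity(row1: str, row2: str, neutral: str = "X-") -> float:
    """Identities per 100 aligned residues for two rows of equal length.

    Assumption: a column counts as 'aligned' only if neither row holds a
    neutral element or gap character at that position.
    """
    if len(row1) != len(row2):
        raise ValueError("aligned rows must have equal length")
    aligned = identical = 0
    for a, b in zip(row1, row2):
        if a in neutral or b in neutral:
            continue
        aligned += 1
        if a == b:
            identical += 1
    return 100.0 * identical / aligned if aligned else 0.0
```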

Sequences. Amino acid sequences were taken from an updated
version of the NEWAT database (Doolittle 1981). Primary
references to the nine tyrosine kinase sequences and nine of the
globin sequences have been provided in an earlier study (Feng
et al. 1985). The additional globins used in the present study are
from lamprey (Zelenik et al. 1979) and the bacterium Vitreoscilla
(Wakabayashi et al. 1986). The superoxide dismutase sequences
studied are human (Jabusch et al. 1980), bovine (Steinman et al.
1974), swordfish (Rocha et al. 1984), fruitfly (Lee et al. 1985),
maize (Cannon et al. 1987), yeast (Johansen et al. 1979), and
photobacter (Steffens et al. 1983).

Pairwise Alignments. The algorithm of Needleman and
Wunsch (1970) was used in a three-matrix form (Fredman 1984)
and utilized the Mutation Matrix of Dayhoff et al. (1978) in its
scoring. The algorithm was actually employed in several slightly
different settings. In the first, a program called SCORE aligns
pairs of sequences in the conventional way and stores their
alignment scores in a table. The similarity scores obtained from the
alignments are converted to difference scores by the relationship

$$D = -\ln S_{\mathrm{eff}} \times 100 = -\ln \frac{S_{\mathrm{real}} - S_{\mathrm{rand}}}{S_{\mathrm{ident}} - S_{\mathrm{rand}}} \times 100$$

where $S_{\mathrm{real}}$ is the alignment score itself, $S_{\mathrm{rand}}$ is the score obtained
with random sequences of the same lengths and compositions,
and $S_{\mathrm{ident}}$ is the average score of the two sequences being compared
when each is aligned with itself. In practice, in these initial
pairwise comparisons we use an average value for $S_{\mathrm{rand}}$ based on
many previous observations (Feng et al. 1985). Inasmuch as this
initial set of comparisons is assumed to be imperfect, no precision
is lost by the modification, and considerable time is saved by the
omission of numerous jumble comparisons. The value used, after
normalization to a standard length, was 770, the average random
score for numerous comparisons of many different kinds of
sequences (Feng et al. 1985).
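
A worked instance of the conversion may help. The sketch below assumes the relationship reads D = -ln(S_eff) x 100 with S_eff = (S_real - S_rand) / (S_ident - S_rand), as the surrounding definitions suggest; the function name and the example scores of 1000 and 1230 are invented for illustration, while 770 is the average random score quoted in the text.

```python
import math


def difference_score(s_real: float, s_ident: float, s_rand: float = 770.0) -> float:
    """Convert a similarity score to a difference score D.

    s_real  : score of the actual alignment
    s_ident : average score of the two sequences aligned with themselves
    s_rand  : score expected for random sequences (770 in the text)
    """
    s_eff = (s_real - s_rand) / (s_ident - s_rand)
    if s_eff <= 0:
        # No similarity above the random level: the logarithm is undefined.
        raise ValueError("alignment score does not exceed the random score")
    return -math.log(s_eff) * 100


# Hypothetical scores: an alignment halfway between random and identical.
d = difference_score(s_real=1000.0, s_ident=1230.0)
```

Here S_eff is 230/460 = 0.5, so D is 100 x ln 2; identical sequences give S_eff = 1 and D = 0.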

The Needleman-Wunsch algorithm is used in a second series
of alignments in a mode in which gaps are concurrently filled
with neutral elements. In the main version, DFalign, sequences
are aligned successively. Should the tree in question be a
compound tree, subclusters are first prealigned with a simpler version
of the program called PREalign.

Tree Building. A program based directly on the Fitch and 
Margoliash (1967) procedure was written in our laboratory by 
Mark Johnson. The program, BORD, was used to establish
preliminary branching orders. Simply put, the smallest difference
score is identified and a new matrix constructed that contains
the average distances between members of the first pair and
remaining members of the set. The procedure is repeated until all
scores have been incorporated. A second program, BLEN, was
used for determining branch lengths of the final tree. This
program employs a least-squares approach as described by Klotz
and Blanken (1981). In the event that a tree contains one or more
"negative segments," the "nearest alternative" trees are
considered and their scores compared. Nearest alternative trees are those
in which the branches immediately adjacent to a negative
segment are switched. The program TREEplot, also written by Mark
Johnson, puts the data in an appropriate form for the Zeta plotter
in order that dendrograms can be issued directly.
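
The averaging procedure described for establishing a preliminary branching order can be sketched as a simple average-linkage clustering. This is not the BORD program itself: the function names, the nested-tuple output, and the rule of averaging over all original pairwise scores between cluster members are assumptions made for illustration.

```python
from itertools import combinations


def cluster_distance(members_a, members_b, dist):
    """Average of the original difference scores between two clusters."""
    pairs = [(a, b) for a in members_a for b in members_b]
    return sum(dist[frozenset(p)] for p in pairs) / len(pairs)


def branching_order(names, dist):
    """Repeatedly join the two closest clusters.

    names : list of sequence labels
    dist  : dict mapping frozenset({a, b}) to a difference score
    Returns the branching order as nested tuples.
    """
    # Each cluster is (tree, members): a nested tuple and a flat tuple.
    clusters = [(name, (name,)) for name in names]
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(
                clusters[ij[0]][1], clusters[ij[1]][1], dist
            ),
        )
        (tree_i, mem_i), (tree_j, mem_j) = clusters[i], clusters[j]
        merged = ((tree_i, tree_j), mem_i + mem_j)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]
```

For four sequences in which A and B are closest, C is next nearest to that pair, and D is most distant, the result is a simple tree in the sense defined earlier, with A and B joined first.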

Outline of the Progressive Method 

Pairwise Alignments 

For n sequences, the number of pairwise align- 
ments required for the initial matrix amounts to 
(n - 1) × n/2. To this end, a simple UNIX shell
program was constructed for running each compar- 
ison serially with the program SCORE; the resulting 
difference scores are automatically stored in a suit- 
able file. 

Identification of Most Closely Related Pair 

The program BORD takes the output from SCORE 
and establishes a preliminary order of the sequences. 
The program BLEN uses the difference matrix from 
the SCORE program combined with a simple "con- 
nectivity table" to give branch lengths; the connec- 
tivity table merely puts all the connecting segments 
in tabular form. BLEN is only used at this point if
trees based on pairwise comparisons are going to be 
prepared. The BORD program reveals whether or 
not the starting tree is simple or compound. In the 
case of compound trees, subclusters are prealigned 
with the program PREalign, which aligns the cluster 
and fills the gaps with neutral elements (Xs). 

Progressive Insertion of Neutral Elements 

The program DFalign, which is the heart of the 
procedure, is used to generate the multiple align- 
ment. It begins by inserting neutral elements (Xs) 
in any gaps that occur in the aligned pair with the 
highest similarity score. After the original pair has 
been established and the gaps fixed, the next nearest 
relative or set of relatives is brought in and a new 
alignment made and a score determined. The key
to this alignment is that new gaps can be incorporated
into either sequence, but the earlier gaps are
preserved. The first ternary arrangement, ABC, is
then compared with the alternative BAC, the higher
score being used to set the path for the next
alignment. Similarly, when the next sequence is brought
in, the arrangement ABCD is scored and compared
with ABDC. Prealigned subclusters are maintained
as separate units, however. The procedure is
continued until all sequences have been incorporated.
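
The "once a gap, always a gap" rule can be illustrated with a small global alignment in which gaps from an earlier stage are frozen as neutral X characters. This is a toy sketch and not DFalign: it assumes simple match/mismatch scores in place of the Dayhoff matrix, a linear gap penalty, and hypothetical function names; the only property taken from the text is that X scores zero against any residue.

```python
def pair_score(a, b, match=2, mismatch=-1):
    """Toy residue score; the neutral element X scores zero against anything."""
    if a == "X" or b == "X":
        return 0
    return match if a == b else mismatch


def needleman_wunsch(s, t, gap=-2):
    """Global alignment with a linear gap penalty. Returns (row_s, row_t, score)."""
    n, m = len(s), len(t)
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap
    for j in range(1, m + 1):
        F[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(
                F[i - 1][j - 1] + pair_score(s[i - 1], t[j - 1]),
                F[i - 1][j] + gap,
                F[i][j - 1] + gap,
            )
    # Trace back from the bottom-right corner to recover one optimal path.
    a, b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + pair_score(s[i - 1], t[j - 1]):
            a.append(s[i - 1]); b.append(t[j - 1]); i -= 1; j -= 1
        elif i > 0 and F[i][j] == F[i - 1][j] + gap:
            a.append(s[i - 1]); b.append("-"); i -= 1
        else:
            a.append("-"); b.append(t[j - 1]); j -= 1
    return "".join(reversed(a)), "".join(reversed(b)), F[n][m]


def fix_gaps(row):
    """Freeze the gaps of an accepted alignment as neutral elements."""
    return row.replace("-", "X")


# Align the closest pair, freeze its gaps, then bring in a third sequence.
row_a, row_b, _ = needleman_wunsch("ACGT", "AGT")
frozen_b = fix_gaps(row_b)
new_b, new_c, _ = needleman_wunsch(frozen_b, "ACGGT")
```

Because the X is now part of the row, a later alignment may add new gaps around it but cannot remove it, which is how the earlier gap is preserved.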

Scoring the Final Alignment 

The final alignment is scored with a modified
regimen that recognizes the fixed nature of the gaps.
Moreover, because the gaps are fixed, it is
unnecessary to use an alignment program at this stage.
Instead, a scoring system is used that measures $S_{\mathrm{real}}$
and $S_{\mathrm{ident}}$ in the usual way, but that employs a
program, SHUFFLE, for determining $S_{\mathrm{rand}}$. SHUFFLE
randomizes each sequence numerous times while
holding the gaps constant.
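
The randomization step can be sketched as follows. This is an illustration rather than the SHUFFLE program: the function names, the number of trials, and the column-by-column scoring callback are assumptions, and gap positions are taken to be marked by the neutral element X.

```python
import random


def shuffle_keep_gaps(row, rng, gap_char="X"):
    """Permute the residues of a row while leaving gap positions in place."""
    residues = [c for c in row if c != gap_char]
    rng.shuffle(residues)
    it = iter(residues)
    return "".join(c if c == gap_char else next(it) for c in row)


def random_score(row1, row2, score_pair, trials=100, seed=0):
    """Average column-by-column score over repeated gap-preserving shuffles.

    No realignment is performed, since the gaps are fixed.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        a = shuffle_keep_gaps(row1, rng)
        b = shuffle_keep_gaps(row2, rng)
        total += sum(score_pair(x, y) for x, y in zip(a, b))
    return total / trials
```

A shuffled row keeps the same composition and the same X positions as the original, so the random score reflects composition and gap placement but not residue order.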

Constructing the Tree 

The program BORD is used to obtain the new 
branching order and the program BLEN to deter- 
mine the branch lengths. If any negative segments 
result, alternative trees with the branches on either 
side of the negative segment reversed are construct- 
ed and a new set of branch lengths calculated. If 
negative segments are still present, the alternating 
procedure is continued until they disappear, al- 
though we have not yet encountered a situation where 
more than one switch was necessary. The program 
TREEplot is used to produce the final dendrogram. 
A schematic outline of the programs called from start
to finish is presented in Fig. 1.

Results 

Superoxide Dismutase 

The sequences of seven copper-zinc superoxide
dismutases (human, bovine, swordfish, fruitfly, maize,
yeast, and photobacter) were subjected to a
conventional pairwise alignment scheme and a tree
constructed by the Fitch and Margoliash (1967)
procedure (Fig. 2a). The same seven sequences were
then treated by the progressive procedure and a tree 
generated (Fig. 2b). The trees differ both in branch 
order and branch length. More to the point, the 
progressive procedure yields a tree that corresponds 
to the accepted phylogeny of the organisms, whereas 
the conventionally generated tree does not. 

In fact, the initial tree issued from the ordinary 
Fitch and Margoliash (1967) treatment had the ex- 
pected phylogenetic branching order, but contained 
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a negative segment. When the nearest alternative 
tree was examined, generated by reversing the 
branches on either side of the negative segment, the 
sum of branch lengths was lowered, and a "better
tree" with no negative segments emerged (Fig. 2a).
The tree contradicts what is known of the
evolutionary relationships of the organisms involved,
however, in that the branch to the yeast sequence
comes off above the branch to the Drosophila
sequence.



Progressive Alignment Procedure

(Binary mode:) SCORE -> BORD -> BLEN -> PREalign
(Progressive mode:) DFalign -> SHUFFLE -> BORD -> BLEN -> TREEplot


Fig. 1. Flow chart of progressive alignment procedure. Program 
names are shown in boxes. The program BLEN in upper portion 
of figure may be omitted if a tree based on pairwise alignments 
is not going to be constructed. 



It should be emphasized that the multiple align- 
ment (Fig. 3) used to obtain the final tree was ob- 
tained by strictly objective criteria and without re- 
course to "eyeball" manipulation. Moreover, the 
overall similarities, as reflected in the percent iden- 
tities, are more in line with the true distances sep- 
arating the organisms than are those observed in the 
original pairwise alignments (Table 1).

Hemoglobins 

Eleven different globin sequences covering a broad 
spectrum of types were subjected to pairwise align- 
ments and an initial tree constructed from the re- 
sulting difference matrix (Fig. 4a). The tree was sim- 
ilar to those presented in previous reports in that 
cyclostome globins (hagfish and lamprey) branch off 
in advance of the myoglobin-hemoglobin a-chain 
divergence (Goodman et al. 1974; Hunt et al. 1978; 
Feng et al. 1985). When the same 11 sequences were
subjected to the progressive alignment procedure, 
the tree that emerged reversed the order to the more 
biologically reasonable situation in which the cy- 
clostome globins are clustered with those of other 
vertebrates (Fig. 4b). 

Also of interest are the relative positions of the 
plant and invertebrate hemoglobins. In the tree ob- 
tained from pairwise alignments, the plant and bac- 
terial hemoglobins appear to be more closely related 



Table 1. Percent identities calculated from binary (upper
triangle) and progressive (lower triangle) alignment methods

Superoxide dismutases

        sodh  sodb  sodl  sddm  sdmz  sods  sdpb
sodh      -    82    67    60    62    53    31
sodb            -    74    57    61    55    35
sodl     67    72     -    59    59    56    35
sddm     59    59    58     -    68    54    31
sdmz     60    60    58    68     -    57    32
sods     51    52    54    51    54     -    30
sdpb     31    35    34    31    34    31     -
Fig. 2. Phylogenetic trees for seven
superoxide dismutases as determined
by (a) simple pairwise alignments and
(b) progressive multiple alignment. The
four-letter designations are sodh, human;
sodb, bovine; sodl, swordfish;
sddm, fruitfly; sods, yeast; sdmz,
maize; sdpb, photobacter. The same
designations are used in Fig. 3 and
Table 1.



[Fig. 3. Multiple alignment of the seven superoxide dismutase
sequences (sodh, sodb, sodl, sddm, sdmz, sods, sdpb).]



Fig. 4. Phylogenetic trees for 11 globin
sequences as determined by (a) simple
pairwise alignments and (b) the multiple
alignments shown in Fig. 5. The
four-letter designations are hghu, human
globin γ chain; hbhu, human globin
β chain; hahu, human globin α
chain; heha, hagfish hemoglobin; hbrl,
lamprey hemoglobin; myhu, human
myoglobin; mycr, gastropod myoglobin;
hety, earthworm hemoglobin
(Tylorrhynchus); haew, earthworm hemoglobin
(Lumbricus); gpfb, kidney bean
leghemoglobin; hbvs, bacterial hemoglobin
(Vitreoscilla). The same designations
are used in Fig. 5 and Table 2.



to the globins of higher invertebrates and verte- 
brates than are those from annelid worms (Fig. 4a). 
Again, a more traditional grouping is obtained with
the progressive alignment procedure (Fig. 4b). The 
multiple alignment generated by the procedure (Fig. 
5) appears to be an accurate depiction of the history 
of events during globin evolution, and the degrees 
of similarity of the various globins based on these 
alignments are also more in line with expectations 
than are those found from simple binary alignments 
(Table 2). 

Tyrosine Kinase-like Sequences 

We had previously aligned a set of nine tyrosine 
kinase-like sequences and constructed a tree based 
on a simple pairwise matrix (Feng et al. 1985), and 
it was naturally of interest to see how the progressive 
alignment treatment compared (Table 3). In this
case, unlike the situations with the superoxide
dismutases and hemoglobins, the branching orders
found by the two procedures did not differ (Fig. 6a
and b). The multiple sequence alignment that was
generated automatically during the procedure (Fig.
7) was somewhat different from the "eyeball"
alignment made previously on the basis of a series of
pairwise comparisons, although the same 14
invariant residues occur coincidentally in both renditions
(Fig. 7). The trees themselves are not significantly
different, although the branch lengths differ slightly.

Discussion 

Fig. 5. Multiple alignment of 11 globin sequences determined by progressive
method. Asterisks denote locations where all 11 residues are [...]
permutative trial described in the text. See legend to Fig. 4 for four-letter
designations.

Table 2. Percent identities calculated from binary (upper triangle) and progressive (lower triangle) alignment methods

The concept of using pairwise alignments iteratively
to establish phylogenetic relationships is hardly new.
Moore et al. (1973) constructed the best possible
dendrogram for a set of sequences by an iterative
pairwise process, and, more recently, Hogeweg and
Hesper (1984) used a heuristic approach for gen- 
erating trees that also depends on successive pair- 
wise alignments. As far as we know, however, the 
notion of "once a gap, always a gap," coupled with 
progressive pairwise alignment, has not been uti- 
lized before. Gap preservation is achieved by the 
insertion of neutral elements that hold the gap
positions fixed during each progressive realignment.

Two things are certain: the method, while heu- 
ristic, provides multiple sequence alignments that 
are based on objective criteria, and trees derived 
from these alignments appear to be in harmony with 
the biology of the proteins as evidenced by the phy- 
logeny of the organisms from which they are ob- 
tained. The simplicity of the procedure is attested 
to by the small number of pairwise comparisons that 
must be undertaken to produce the multiple
alignment (Table 4). Thus, if 10 sequences are to be
aligned, only 61 comparisons have to be made. This
is a smaller number of alignments than is ordinarily
performed when a set of jumbles is made for a single
quantitative alignment. In this regard, we have
eschewed the use of jumbled comparisons in the initial
alignments in favor of an empirically determined
average random score.
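
The figure of 61 comparisons for 10 sequences can be checked with a short calculation. The initial count is the n(n - 1)/2 pairwise alignments stated earlier; the rule of two additional alignments for each sequence added after the first pair, 2(n - 2), is an inference from the rows of Table 4 and from the description of comparing two arrangements at each step, not a formula given in the text. It applies to simple trees only.

```python
def initial_alignments(n: int) -> int:
    """Pairwise alignments needed for the initial matrix: n(n - 1)/2."""
    return n * (n - 1) // 2


def additional_alignments(n: int) -> int:
    """Assumed pattern for simple trees: two alignments per added sequence."""
    return 2 * (n - 2)


def total_alignments(n: int) -> int:
    return initial_alignments(n) + additional_alignments(n)


# Reproduce the totals listed in Table 4 for 3 to 11 sequences.
table4_totals = {n: total_alignments(n) for n in range(3, 12)}
```

For n = 10 this gives 45 initial plus 16 additional alignments, matching the total quoted above.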

Kinds of Sequence Alignment 

Broadly speaking, there are three kinds of multiple 
sequence alignment: (1) structural equivalence types, 
(2) global optimization methods, and (3) historical 
alignments. The first of these, structural
equivalence, is used mainly by crystallographers. The goal
is to align those segments of two protein sequences 
that occupy equivalent three-dimensional orienta- 
tions. As such, these studies are usually restricted 
to protein families at least one member of which 
has had an x-ray structure determined (Bajaj and 
Blundell 1984). The interest is focused on present-day
structure without regard for how the structures
came to be.

Fig. 6. Phylogenetic trees for nine tyrosine
kinase-like sequences determined
from (a) simple pairwise alignments and
(b) progressive alignment. The four-letter
designations are v-src, avian Rous sarcoma
virus transforming factor; v-yes,
avian Y73 sarcoma virus transforming
factor; v-abl, Abelson murine leukemia
virus transforming factor; v-fes, feline
sarcoma virus transforming factor;
v-fps, avian Fujinami virus transforming
factor; v-raf, murine retroviral
transforming factor; v-mos, mouse sarcoma
virus transforming factor; cdc28,
yeast cell division control factor; cadk,
bovine cyclic AMP-dependent kinase.
The same four-letter designations are
used in Fig. 7 and Table 3.

Global optimization methods are designed to ac- 
commodate a set of sequences in a multiple align- 
ment that maximizes overall similarity. Three-di- 
mensional extensions of the Needleman-Wunsch 
algorithm, for example, have been used to achieve
such alignments (Jue et al. 1980; Murata et al. 1985),
and Johnson and Doolittle (1986) have used the
overlapping approach pioneered by Fitch (1966,
1970) to generate four-way and five-way alignments.
Again, these alignments are made without regard to 
historical detail. 

Historical alignments are based on the notion 
that divergent evolution is fundamentally binary in 
nature. Long ago Dayhoff et al. (1972), noting that
matrix methods greatly foreshorten the more ancient
branches in evolutionary trees, used a common-ancestor
approach to alignment and tree building
that was historical in principle. The character-based
approach that they used was much clumsier than
matrix methods, however, and eventually was abandoned.
Subsequently, Holmquist (1979, p 939)
drew attention to the fact that parsimony methods
err significantly, "the magnitude of the error increasing
with the distance of the nodal sequence
from the present," and, more recently, Penny and
Hendy (1986) have expounded on the theme that
the minimal tree cannot be the historical tree.

Table 3. Percent identities calculated from binary (upper triangle) and progressive (lower triangle) alignment methods

Table 4. Numbers of pairwise alignments required to construct
a phylogenetic tree by a progressive method*

Number of    Initial pairwise    Additional iterative
sequences    alignments          alignments              Total

    3              3                    2                   5
    4              6                    4                  10
    5             10                    6                  16
    6             15                    8                  23
    7             21                   10                  31
    8             28                   12                  40
    9             36                   14                  50
   10             45                   16                  61
   11             55                   18                  73

* Values are minimal numbers for simple trees; compound trees
need an additional alignment for each subcluster. Also, occasional
negative segments in some trees will necessitate additional
alignments.

It is obvious that methods based on mere global 
optimization will consistently underestimate evo- 
lutionary distances among the least related members 
of the set, striving as they do to achieve maximum 
alignment scores. The need is to throttle the ten- 
dency for optimization while preserving the notion 
of similar residues replacing one another. The pro- 
gressive alignment procedure presented here ap- 
pears to achieve that end. In its favor, the trees 
generated from these alignments appear to be in 
accord with biological expectations. 



Superoxide Dismutase Relationships 

The copper-zinc superoxide dismutase sequences 
have been the subject of much debate since the pos- 
sibility was raised that the sequence found in the 
prokaryote Photobacterium leiognathi might be the
result of a horizontal gene transfer from its ponyfish
host (Martin and Fridovich 1981). Although solid
evidence to the contrary was provided by Steffens
et al. (1983), the notion has refused to go away (Bannister
and Parker 1985). Our thinking about this
matter is wholly in accord with that recently expressed
by Leunissen and De Jong (1986): to wit,
there is no basis for supposing anything other than
a conventional history of events. Indeed, either of
the evolutionary trees in Fig. 2 ought to dispel
thoughts of a horizontal gene transfer for this gene,
the photobacter position being entirely consistent
with what would be expected for a typical prokaryotic-eukaryotic
divergence. On the other hand, the
tree made from pairwise alignments (2a) does have
an unreasonable arrangement for the fruitfly and yeast,
whereas the progressive tree is quite in line with 
conventional phylogeny. 

It should be pointed out in passing that an ap- 
parent speed-up in the rate of copper-zinc super- 
oxide dismutase evolution has occurred among the 
vertebrates (Lee et al. 1985). Thus, the apparent 
differences between mammalian and Drosophila se- 
quences are much greater than would be expected 
on the basis of a comparison of the Drosophila and 
yeast sequences. The fact that there appears to have 
been a relaxation of selection pressures on the ver- 
tebrate superoxide dismutase should not affect the 
branching order, of course.



Hemoglobins and Myoglobins 

The progressive alignment scheme also yields rea- 
sonable results when applied to distantly related glo- 
bin sequences. In contrast to phylogenies employing 
a maximum parsimony method (Goodman et al. 
1974), the progressive method roots the lamprey 



and hagfish globins to the same branch as other 
vertebrate hemoglobins. Interestingly, an early study 
employing the common ancestor approach (Dayhoff 
and Eck 1968) also had the lamprey in this position. 
With regard to the relationship of animal and plant 
globins, the depth of the differences warrants a good 
deal of caution. Nonetheless, the recently published 
bacterial globin sequence (Wakabayashi et al. 1986)
resembles the plant globins more than it does the 
animal globins, and it is not impossible that an un- 
usual genetic event involving plants and symbiotic 
bacteria has occurred. A larger study encompassing 
all the known invertebrate and plant globin se- 
quences may reveal more about the evolutionary 
connections of these proteins. 



Concluding Remark 

It is not our intention to reopen past skirmishing 
about the relative merits of strict parsimony meth- 
ods and alternative treatments (Fitch 1981; Holm- 
quist and Jukes 1981). Nor is it our aim merely to 
add one more comment to the enormous literature 
on the construction of evolutionary trees with se- 
quence data (Tateno et al. 1982; Hogeweg and Hes- 
per 1984; Penny and Hendy 1986, to name but a 
few). Rather, we simply offer a heuristic procedure 
for a computer-determined multiple alignment of 
related amino acid sequences that can be effected 
rapidly by objective criteria. Evolutionary trees 
drawn directly from these alignments appear to be 
very much in accord with biological expectations. 

Note Added in Proof. During the period since the acceptance of
this article we have applied the procedure in numerous settings,
and, in some cases, the final alignment was slightly imperfect.
The situation was remedied, however, by aligning each new sequence,
or set of sequences, with an average sequence of all the
sequences already aligned. This was accomplished by simply
looking up the matrix value for every pair of residues at each
position and averaging them. We are grateful to Steve Hanks for
bringing the problem to our attention and to Mark Johnson for
helping with the solution.
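
The averaging step described in the note can be illustrated with a minimal Python sketch. The toy matrix (1 for identity, 0 otherwise), the skipping of gap characters, and the function name are all assumptions made for illustration; the note does not specify the matrix or how gaps were treated.

```python
def average_column_score(residue, column, matrix):
    """Mean matrix value of `residue` against each residue in one column
    of an existing alignment. Gap characters ('-') are skipped, which is
    an assumption of this sketch."""
    others = [c for c in column if c != "-"]
    if not others:
        return 0.0
    return sum(matrix[(residue, c)] for c in others) / len(others)


# Toy scoring matrix, for illustration only: 1 for identity, 0 otherwise.
ALPHABET = "ACDE"
TOY = {(a, b): 1.0 if a == b else 0.0 for a in ALPHABET for b in ALPHABET}

aligned = ["ACD", "ACE", "A-E"]        # sequences already aligned
column_2 = [s[2] for s in aligned]     # residues at the third position
print(average_column_score("E", column_2, TOY))
```

A real application would substitute an amino acid similarity matrix for the toy table and use the averaged values as the match scores when aligning the new sequence to the existing alignment.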

ABSTRACT

The University of Wisconsin Genetics Computer Group (UWGCG) has been 
organized to develop computational tools for the analysis and publication of 
biological sequence data. A group of programs that will interact with each 
other has been developed for the Digital Equipment Corporation VAX computer 
using the VMS operating system. The programs available and the conditions for 
transfer are described. 

INTRODUCTION 

The rapid advances in the field of molecular genetics and DNA sequencing 
have made it imperative for many laboratories to use computers to analyze and 
manage sequence data. UWGCG was founded when it became clear to several 
faculty members at the University of Wisconsin that there was no set of
sequence analysis programs that could be used together as a coherent system 
and be modified easily in response to new ideas. 

With intramural support a computer group was organized to build a strong 
foundation of software upon which future programs in molecular genetics could 
be based. This initial project has been completed and the resulting programs, 
written in Fortran 77, are available for VAX computers using the VMS operating 
system. Most of the programs can be used with only a terminal, although 
several require a Hewlett Packard plotter. 

UWGCG software has been installed for testing at eight different 
institutions. A simple method has been developed for transferring and 
maintaining this system on other VAX computers.

DESIGN PRINCIPLES 

UWGCG program design is based on the "software tools" approach of
Kernighan and Plauger(1). Each program performs a simple function and is easy
to use. The programs can be used independently in different combinations so 
that complex problems are solved by the use of several programs in succession.
New programming is simplified since less effort is required to bridge a gap 
between existing programs. 

UWGCG software is designed to be maintained and modified at sites other 
than the University of Wisconsin. The program manual is extensive and the 
source codes are organized to make modification convenient. Scientists using
UWGCG software are encouraged to use existing programs as a framework for 
developing new ones. Our copyright can be removed from any program modified 
by more than 25% of our original effort. 

PROGRAMS AVAILABLE FROM UWGCG 

The programs described below are named and defined individually in Table 1.
Program names in the text are underlined.

Comparisons 

Comparisons may be done with "dot plots" using the method of Maizel and 
Lenk(2). Optimal alignments can be generated by the methods of Needleman and 
Wunsch(3), of Sellers(4), and the "local homology" method of Smith and
Waterman(5). The Smith and Waterman alignment algorithm is also the most
sensitive method available for identifying similarities between weakly related 
sequences. 
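
For readers unfamiliar with the "local homology" idea, the following Python sketch computes a Smith and Waterman style local alignment score. It is an illustration only, not the UWGCG Fortran implementation: the match, mismatch and gap values are assumed, a simple linear gap penalty is used, and the function name is hypothetical.

```python
def smith_waterman_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best local alignment score between strings a and b.

    Scoring values are illustrative assumptions. Cells are floored at
    zero so that an alignment can start anywhere in either sequence.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("AAAGATTACA", "CCGATTACATT"))
```

Because every cell is floored at zero, unrelated flanking regions do not reduce the score of a shared segment, which is why the method suits weakly related sequences.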

Mapping and Searching 

Mapping is available in several formats. Graphic maps display all of the 
cuts for each restriction enzyme on parallel lines. This graphic map 
facilitates selection of enzymes for isolating any region of a sequenced DNA 
molecule. Sorted maps in tabular format arrange the fragments from any 
digestion in order of molecular weight to show which fragments are similar in 
size and thus likely to be confused in gels. Another frequently used mapping 
format, designed by Frederick Blattner(6), displays the enzyme cuts above the 
original DNA sequence. Both strands of the DNA and all six frames of 
translation are shown. 

All mapping programs will search for user-specified sequences, allowing 
features to be marked at the appropriate position on a restriction map. The 
mapping and searching programs can be used to aid site-specific mutagenesis 
experiments by showing where mutations could generate new restriction sites. 
All of the positions in a sequence where a synthetic probe could pair with one 
or more mismatches can also be located. Sequences related to less precisely 
defined features such as promoters or intervening sequence splice sites, can 
be located with a program that uses a consensus sequence as a probe. The
mapping programs can also be used on protein sequences to identify the
peptides resulting from proteolytic cleavage.

Table 1

Programs Available from UWGCG

Name               Function

DotPlot+           makes a dot plot by method of Maizel and Lenk(2)
Gap                finds optimal alignment by method of Needleman and Wunsch(3)
BestFit            finds optimal alignment by method of Smith and Waterman(5)
MapPlot+           shows restriction map for each enzyme graphically
MapSort            tabulates maps sorted by fragment position and size
Map                displays restriction sites and protein translations above
                   and below the original sequence (Blattner, 6)
Consensus          creates a consensus table from pre-aligned sequences
FitConsensus       finds sequences similar to a consensus sequence using a
                   consensus table as a probe
Find               finds sites specified interactively
Stemloop           finds all possible stems (inverted repeats) and loops
Fold*              finds an RNA secondary structure of minimum free energy
                   by the method of Zuker(7)
CodonPreference+   plots the similarity between the codon choices in each
                   reading frame and a codon frequency table (8)
CodonFrequency     tabulates codon frequencies
Correspond         finds similar patterns of codon choice by comparing
                   codon frequency tables (Grantham et al, 9)
TestCode+          finds possible coding regions by plotting
                   the "TestCode" statistic of Fickett(10)
Frame              plots rare codons and open reading frames (8)
PlotStatistics+    plots asymmetries of composition for one strand
Composition        measures composition, di and trinucleotide frequencies
Repeat             finds repeats (direct, not inverted)
Fingerprint        shows the labelled fragments expected for an RNA fingerprint
Seqed              screen oriented sequence editor for entering, editing
                   and checking sequences
Assemble           joins sequences together
Shuffle            randomizes a sequence maintaining composition
Reverse            reverses and/or complements a sequence
Reformat           converts a sequence file from one format to another
Translate          translates a nucleotide into a peptide sequence
BackTranslate      translates a peptide into a nucleotide sequence
Spew               sends a sequence to another computer
GetSeq             accepts a sequence from another computer
Crypt              encrypts a file for access only by password
Simplify           substitutes one of six chemically similar amino acid
                   families for each residue in a peptide sequence
Publish            arranges sequences for publication
Poster+            plots text (for labelling figures and posters)
OverPrint          prints darkened text for figures with a daisy wheel printer

+ requires a Hewlett Packard Series 7221 terminal plotter
* Fold is distributed by Dr. Michael Zuker not UWGCG.

Secondary Structure 

Three programs are available to examine secondary structure in nucleic 
acids. The program StemLoop identifies all inverted repeats. An 
implementation of Dr. Michael Zuker's Fold program(7) finds an RNA secondary 
structure of minimum free energy based on published values of stacking and 
loop destabilizing energies. The "dot plot" comparison (mentioned above) of a 
sequence compared to its opposite strand gives a graphic picture of the 
pattern of inverted repeats in a sequence. 

Analysis of Composition and the Location of Genetic Domains 

Regions of a sequence with non-random base distribution can be displayed 
with three graphic tools designed to identify genetic domains. The program 
CodonPreference(8) identifies potential coding regions by searching through
each reading frame for a pattern of preferred codon choices. The
CodonPreference plot predicts the level of translational expression of mRNAs
and helps identify frame shifts in DNA sequence data. Patterns of codon
choice can be compared with the program Correspond(9). When a strong pattern
of codon preferences is not expected, the "TestCode" statistic of Fickett(10)
can be plotted to show regions of compositional constraint at every third
base. Another program plots asymmetries of composition by strand. Strand
asymmetries have been associated with genetic domains by several
authors(11)(12). A fourth program called Frame marks the positions of rare
codons and open reading frames on a graph showing all six reading frames. 

Several tools are available to measure content and to count dinucleotide,
trinucleotide, neighbor and repeat frequencies. A program that predicts RNA 
fingerprint patterns and another that tabulates codon frequencies complete the 
group of programs that analyze composition. 
Sequence Manipulation 

Sequences may be entered, assembled, edited, reversed, randomized,
reformatted, translated, back-translated, documented, transferred, or 
encrypted rapidly with a large set of sequence manipulation tools. 

A screen-oriented editor is available that allows sequences to be entered 
and checked. After a sequence is entered, it may be reentered for 
proofreading. Whenever a reentered base is at variance with the original, the 
terminal bell rings and the position is marked. Existing sequences can be 
edited quickly by moving directly to a sequence position specified by either a 
coordinate or a sequence pattern. The program can reassign the terminal's 
keys to place G, A, T and C conveniently under the fingers of one hand in the
same order as the lanes of a sequencing gel. 

Programs are available for changing sequence file format. Sequence data 
from any source can be used in UWGCG programs, and sequence files maintained 
with UWGCG software can be converted for use in other non-UWGCG programs. For 
instance, the programs of Roger Staden(13) or Intelligenetics Inc.(14) could
be used to assemble a sequence from the sequences of many small sub-fragments 
generated by DNAase I digestion. The assembled sequence could then be 
reformatted for use in any UWGCG program. A program is available that 
transfers sequences to and from other computers. 
Sequence Publication 

A program, Publish, will format sequences into figures. Publish has
alternatives for line size, numbering, scaling, translation and comparison to 
other sequences. Poster is a program that will plot text on figures. 

GENERAL FEATURES OF UWGCG SOFTWARE 
Interactive Style 

Each program is run by simply typing its name. Every parameter required 
by the program is obtained interactively. Questions are answered with a file 
name, a yes, a no, a number, or a letter from a menu. Default answers are 
displayed. Programs are insensitive to absurd answers and will ask the 
question again if, for instance, you name a file that does not exist or if you 
use a nonnumeric character when typing a number. Special features, such as
plotting features oriented to publication, are obtained by using an extra word
next to the program's name when the program is run. Thus parameter queries
are kept to a minimum for the normal use of each program. 
Data 

Both the NIH-GenBank(15) and the EMBL(16) nucleotide sequence data 
libraries are available "on-line" to any UWGCG program. A Search utility will 
locate sequences in the libraries by key word. A Find utility will locate 
library entries containing any specified sequence. A program is available 
that installs the new data sent periodically from GenBank and EMBL to update 
their data libraries. 

All of the data in the system are stored in text files that can be read 
and modified easily. Every data file has an English heading describing the 
contents. The data files may be copied by each user for analysis or 
modification. Programs recognize and read user-modified input data 
automatically. Data files can be modified with any text editor. 
Sequence File Structure

Sequences are maintained in files that allow documentation and numbering 
both above and within the sequence. This file format is compatible with both 
of the nucleic acid sequence libraries and has been adopted as the standard 
sequence file format by the data base project at the European Molecular 
Biology Lab. Because genetic manipulations commonly involve linking several 
molecules of known sequence, UWGCG sequence files are designed to support 
concatenation by allowing comments to appear within the sequences at any 
location. Coding sequences or the boundaries between cloning vector and 
insert, for instance, can be marked within the sequence itself for immediate 
identification. 
Sequence Symbols 

All possible nucleotide ambiguities and all standard one-letter amino 
acid codes are part of the UWGCG symbol set that includes all alphabetic 
characters plus five additional characters. The proposed IUB-IUPAC standard
nucleotide ambiguity symbols(17) are used for the mapping, searching and 
comparison programs. Lower case characters are used in sequences to indicate 
uncertainty as distinct from ambiguity. This allows the entire lexicon of 
symbols to be reused with the same meaning, but with the prefix "maybe-." This
reuse of the symbol set in lower case makes the uncertainty symbols more 
complete, understandable and visible. 
Symbol Comparison 

Sequence analysis programs generally make comparisons between sequence 
symbols (bases or amino acids) in order to find enzyme sites, create 
alignments, locate inverted repeats etc. These symbol comparisons are handled 
in several ways. 

Symbol comparisons for alignment, comparison and secondary structure
analysis are made by looking up a value in a symbol comparison table for the
quality of the match. The table might contain 1's for matches and 0's for
mismatches. If amino acids are being compared, however, a real number could 
be assigned at each position based on some previously assigned chemical 
similarity of the pair of residues or on the mutational distance between their 
codons. Standard symbol tables are provided by UWGCG, but the system is 
designed to allow each user to specify his own values. 

Symbol comparisons for mapping and searching operations in nucleic acids
are made by converting the IUB-IUPAC symbols into a binary code. The bits of
this code represent G, A, T and C with ambiguity symbols causing more than one
bit to be set. A group of library functions identify overlap between the bits
for each IUB-IUPAC symbol.
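
The bit-coding scheme can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the UWGCG library: the particular bit assigned to each base, the function names, and the decision to fold lower case (uncertain) symbols to upper case are all choices made here. The ambiguity symbols themselves are the standard IUB-IUPAC nucleotide codes.

```python
# One bit per base. The bit order is an assumption; the text says only
# that the bits represent G, A, T and C.
G, A, T, C = 1, 2, 4, 8

IUPAC_BITS = {
    "G": G, "A": A, "T": T, "C": C,
    "R": A | G, "Y": C | T, "M": A | C, "K": G | T,
    "S": C | G, "W": A | T,
    "H": A | C | T, "B": C | G | T, "V": A | C | G, "D": A | G | T,
    "N": A | C | G | T,
}


def symbols_overlap(x, y):
    """True if the two symbols share at least one possible base."""
    return (IUPAC_BITS[x.upper()] & IUPAC_BITS[y.upper()]) != 0


def find_sites(pattern, sequence):
    """Return 0-based start positions where every symbol pair overlaps."""
    hits = []
    for start in range(len(sequence) - len(pattern) + 1):
        window = sequence[start:start + len(pattern)]
        if all(symbols_overlap(p, s) for p, s in zip(pattern, window)):
            hits.append(start)
    return hits


# GTYRAC is an ambiguous recognition pattern; both GTCGAC and GTTAAC fit it.
print(find_sites("GTYRAC", "AAGTCGACTTGTTAACGG"))
```

A single bitwise AND replaces a table of symbol pairs, which is presumably why the scheme suits the mapping and searching programs, where every position of a long sequence must be tested against many enzyme patterns.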

Documentation 

Documentation is available both in printed form and on the terminal 
screen. A 350 page manual describes the operation of each program in detail,
gives practical considerations and shows what will appear on the screen during 
a session with the program. Output files and plots are shown for the session. 
The data for the session shown in the documentation are included with the 
system so that each program's operation can be checked. The "on-line"
documentation is the same as the manual, but can be changed immediately when a 
program is modified. 

All programs write output to files that are completely documented and 
sensibly organized for input to other programs. The input data, the program 
and the parameters used are clearly identified in every output file. 
Procedure Library 

UWGCG programs are written largely as calls to a library of 250 
procedures designed to manipulate biological sequences. These procedures use 
data and file structures which have been designed to simplify program 
modification. For instance, standard operations such as reading sequences
from files are always handled by a single library procedure. Thus a change in 
sequence file format requires only one subroutine to be modified for the new 
format to be acceptable to all of the programs in the system. Command 
procedures are available to help modify the library. The procedure library 
can be used by programs written in any language. 

DISTRIBUTION OF UWGCG SOFTWARE 
Intent 

The intent of UWGCG is to make its software available at the lowest 
possible cost to as many scientists as possible. 
Fees 

A fee of $2,000 for non-profit institutions or $4,000 for industries is 
being charged for a tape and documentation for each computer on which UWGCG 
software is installed. While no continuing fee is required, UWGCG software, 
like the field it supports, is changing very rapidly. A consortium of 
industries and academic laboratories is planned to support the project in the 
future. The consortium will entitle its members to periodic updates and to 
influence the direction of new programming undertaken by UWGCG in return for a 
pledge of continuing financial support. 
Copyrights

UWGCG retains the copyrights to all of its software and UWGCG must be 
contacted before all or any part of its software package is copied or
transferred to any machine. UWGCG is, however, mandated to provide research 
tools to help scientists working in the area of molecular genetics and we are 
glad to see our source codes become the basis of further programming efforts 
by other scientists. Copyright can be removed for any program modified by 
more than 25% of its original effort. 
Tape Format 

The UWGCG package is usually distributed in VAX/VMS "backup" format on a
9 track magnetic tape recorded at 1600 bits/inch. The system consists of
about 1000 files using about 20,000 blocks at 512 bytes/block. The current 
versions of the GenBank and EMBL nucleotide sequence data bases are normally 
included which add another 3,000 files and require another 20,000 blocks. 

Upon request UWGCG will make a card image tape of all of the Fortran 77 
programs and procedures for reading on computers other than the VAX. The card 
image tape is usually provided at 1600 bits/inch with 80 characters/record and
10 records/block. Adaptation of UWGCG software to systems other than VAX/VMS
may take considerable effort. 

Equipment Required 

UWGCG programs and command procedures will run on a Digital Equipment 
Corporation (DEC) VAX computer that is using version 3.0 or greater of the DEC 
VMS operating system. A tape drive is necessary; a floating point accelerator 
and a DEC Fortran compiler are helpful, but not required. All programs can be 
run from a DEC VT52 or VT100 terminal. Seven programs, as noted in Table 1,
require a Hewlett Packard 7221 terminal plotter wired in series with the
terminal. Several utilities support a daisy wheel compatible printer attached
to the terminal's pass-through port; however, all programs write output files
suitable for printing on any standard device. 
Inquiries 

Inquiries may be sent to John Devereux at the Laboratory of Genetics, 
University of Wisconsin, Madison, WI, USA 53706, (608) 263-8970. UWGCG is not 
licensed to distribute Fold(7), but the UWGCG implementation is available from 
Michael Zuker, Division of Biological Sciences, National Research Council of 
Canada, 100 Sussex Drive, Ottawa, Canada, K1A 0R6 (613) 992-4182.

ABSTRACT

The 5'-untranslated leader sequences of several plant RNA
viruses, and a portion of the 5'-leader of an animal retrovirus,
were tested for their ability to enhance expression of contiguous
open reading frames for chloramphenicol acetyltransferase
(CAT) or β-glucuronidase (GUS) in tobacco mesophyll protoplasts,
Escherichia coli and oocytes of Xenopus laevis. Translation of
capped or uncapped transcripts was substantially enhanced in
almost all systems by the leader sequence of either the U1 or
SPS strain of TMV. All leader sequences, except that of TYMV,
stimulated expression of 5'-capped GUS mRNA with the native
prokaryotic initiation codon context in electroporated protoplasts.
Only the TMV leaders enhanced translation of uncapped
GUS mRNAs in protoplasts and increased expression of uncapped
CAT mRNA in microinjected X. laevis oocytes. In oocytes, the
TYMV leader sequence was inhibitory.

In transformed E. coli, the TMV-U1 leader enhanced expression
of both the native and eukaryotic context forms of GUS mRNA
about 7.5-fold, despite the absence of a Shine-Dalgarno region
in any of the transcripts. The absolute levels of GUS activity
were all about 6-fold higher with mRNAs containing the native
initiation codon context. In E. coli, the leaders of AlMV RNA4
and TYMV were moderately stimulatory whereas those of BMV RNA3,
RSV and the SPS strain of TMV enhanced GUS expression by only 2-
to 3-fold.

INTRODUCTION 

Cis-acting features which influence the selection and translation
of eukaryotic mRNAs are poorly understood. Surveys of
sequences upstream from the AUG start codon have failed to
identify a universal consensus sequence which might act as the
eukaryotic equivalent of the prokaryotic Shine-Dalgarno region,
a region essential for the expression of prokaryotic mRNAs in
E. coli (1). Secondary structures within the 5'-untranslated
leaders of some eukaryotic mRNAs have been claimed to promote
(2,3) or inhibit (4) translation initiation. In prokaryotic
mRNAs, selection of start codons may also be influenced, in
part, by low surrounding secondary structure (5). In the
relaxed scanning model (6), 40S ribosomal subunits bind at the
5'-end of an mRNA and scan until the first AUG in the optimal
context (5'-ACCAUGG-3') is reached, at which point translation
begins. Beyond this, little is known about the longer-range
effects of specific sequences on expression of eukaryotic mRNA.
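
As a purely illustrative aid, the scanning rule just described can be written as a short Python string search. This is a sketch of the rule as stated in the text, not a predictor of real initiation sites: the function name and the example sequence are hypothetical, and only an exact match to 5'-ACCAUGG-3' counts as the optimal context.

```python
OPTIMAL_CONTEXT = "ACCAUGG"   # 5'-ACCAUGG-3', as given in the text


def scan_for_start(mrna):
    """Scan 5' to 3' and return (index of the first AUG in the optimal
    context, list of AUG indices passed over on the way).

    Indices are 0-based positions of the A of each AUG. Returns
    (None, skipped) if no AUG lies in the optimal context.
    """
    skipped = []
    pos = mrna.find("AUG")
    while pos != -1:
        if pos >= 3 and mrna[pos - 3:pos + 4] == OPTIMAL_CONTEXT:
            return pos, skipped
        skipped.append(pos)
        pos = mrna.find("AUG", pos + 1)
    return None, skipped


# Hypothetical transcript: an upstream AUG in a poor context, then ACCAUGG.
print(scan_for_start("GGAUGCCACCAUGGCU"))
```

In the example the upstream AUG is passed over because its flanking bases do not match the optimal context, and scanning stops at the second AUG.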

We have shown that translation of prokaryotic (7) and
eukaryotic (8) mRNAs is greatly enhanced by a contiguous
derivative of the 68-nucleotide, 5'-leader sequence of tobacco
mosaic virus (TMV), U1 strain (called Omega (Ω); 9,10). The
stimulatory effect of this Ω-like sequence (referred to as
Ω'-U1) has been observed in vitro and in vivo, in both eukaryotic
and prokaryotic translation systems. Tyc and co-workers (11)
identified a second 80S ribosome binding site, centred on
residues 14-16 (AUU) within Ω-U1 (or Ω'), which was upstream of,
and in frame with, the predicted ribosome binding site at the
first AUG codon (residues 68-70 in Ω-U1). The latter initiates
synthesis of the 126,000 dalton (126Kd) protein encoded by TMV
RNA. In Ω-U1 (and Ω'-U1), 51 nucleotides separate the AUU and
AUG sequences which, in the presence of an inhibitor of elongation
(sparsomycin), permit two ribosomes to bind simultaneously
without steric hindrance. Initiation of translation of genomic
TMV RNA under these conditions has been claimed to result in two
unique dipeptides, Met-Thr and Met-Ala (12), which may arise by
illegitimate or legitimate initiation at the AUU and AUG sites,
respectively. Yokoe and coworkers (13) demonstrated RNA-RNA
hybridization between the 5'-region of Ω-U1, containing the AUU
sequence, and the 3'-terminus of wheat germ 18S rRNA, again
supporting the possibility of disome formation. In addition to
TMV, several other viral RNA leader sequences have been shown to
form disome (or even trisome) structures (2, 14-16). The
36-nucleotide leader of AlMV RNA4 binds only one ribosome (17);
nevertheless it will stimulate expression of contiguous foreign
gene transcripts in vitro (18).

We wished to determine whether translational enhancement was
a general feature of 5'-untranslated viral leader sequences and
if the ability of a viral leader to form disomes could be
correlated with its ability to enhance translation of a
contiguous open reading frame. For this purpose, synthetic
oligonucleotide sequences derived from the 5'-leaders of TMV (U1
strain; disome), TMV (SPS strain; disome), turnip yellow mosaic
virus (TYMV; disome), alfalfa mosaic virus (AlMV) RNA 4 (monosome),
brome mosaic virus (BMV) RNA 3 (disome), and the animal
retrovirus, Rous sarcoma virus (RSV; disome) were analyzed for
their relative abilities to stimulate expression of convenient
reporter gene transcripts in vivo.

MATERIALS AND METHODS 

Bacterial strains, plasmids, enzymes, and media

Escherichia coli strains HB101 and JM101 were obtained from
F. Bolivar and J. Messing, respectively. The pSP64 derivatives
pJII1, pJII101, pJII2, pJII102 have been described (7). The
chloramphenicol acetyltransferase (CAT) reporter gene from Tn9
was obtained from T.J. Close (CSIRO, Canberra, Australia). The
β-glucuronidase gene (GUS) and its derivatives were obtained
from R. Jefferson and M. Bevan (Plant Breeding Institute, Maris
Lane, Trumpington, Cambridge). SP6 RNA-polymerase, human placental
RNase inhibitor, DNA polymerase I (Klenow fragment), T4
DNA ligase and all restriction endonucleases were purchased from
Boehringer (Mannheim), Pharmacia Ltd., or New England BioLabs.
Purified CAT was bought from Pharmacia Ltd. SOC medium (19) was
used to prepare competent E. coli cells, and L-broth (20) was
used for all other cultures.
Plasmid DNA purification and manipulation 

preparative scale (21) and small scale (22) DNA isolations 
were as described. Standard DNA manipulations were performed 
essentially as described (21). 

Oligodeoxyribonucleotide synthesis 

Oligodeoxyribonucleotides were synthesized by S. Gilmore 
and A.J. Northrop (Institute of Animal Physiology, Babraham, 
Cambridge) using a Biosearch 8600 4-channel DNA synthesizer and 
the β-cyanoethyl-phosphoramidite method (23). For each full-length 
dsDNA viral leader, one complete strand (the coding 
strand) was synthesized with a 5'-HindIII site (+1 base) and a 
3'-SalI site (+1 base), for subsequent insertion into the 
transcription plasmid pJII1. A second complementary oligodeoxyribonucleotide 
(24-mer) was then annealed, and the dsDNA filled 
in by polymerization with either DNA polymerase I (Klenow fragment) 
or reverse transcriptase. 
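
The cloning design described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the example leader fragment, and the choice of "C" for the unspecified +1 flanking bases are hypothetical and are not taken from the protocol.

```python
# Sketch: derive a short primer complementary to the 3' end of a synthesized
# coding strand, so that a polymerase can fill in the second strand.
# All names and the flanking "+1" bases are hypothetical.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

HINDIII = "AAGCTT"  # HindIII recognition site
SALI = "GTCGAC"     # SalI recognition site


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def coding_strand(leader_dna: str, pad5: str = "C", pad3: str = "C") -> str:
    """Coding strand: pad + HindIII site + leader + SalI site + pad."""
    return pad5 + HINDIII + leader_dna.upper() + SALI + pad3


def fill_in_primer(strand: str, length: int = 24) -> str:
    """Primer complementary to the last `length` bases of the coding strand."""
    return reverse_complement(strand[-length:])


if __name__ == "__main__":
    strand = coding_strand("TATTTTTACAACAATTACCAACAACAACAAACAACA")
    print(strand)
    print(fill_in_primer(strand))
```

Because both restriction sites are palindromic, the primer carries a complete SalI site, and the filled-in duplex carries both sites in double-stranded form.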

Construction of trp promoter plasmid pJII168 for E. coli 
transformation 

A 90 base pair (bp) HindIII/BamHI fragment containing the 
tryptophan (trp) promoter (P-L Biochemicals, Inc.) was introduced 
into the HindIII/BamHI sites of pJII1 (7), from which a 
400bp BamHI fragment containing the TMV origin-of-assembly 
sequence had been removed. The HindIII site upstream from the 
trp promoter was removed by digestion with HindIII, filling-in 
with DNA polymerase I (Klenow fragment), followed by re-ligation. 
A HindIII site was then introduced at the 3'-end of the 
promoter by replacing the 25bp HpaI/SalI fragment with a 
synthetic 17bp HpaI/SalI fragment, containing a HindIII site 
positioned at the transcription start site. 

RNA synthesis 

In vitro transcription of linearized plasmid DNAs was 
carried out using bacteriophage SP6 RNA polymerase (24). Capped 
transcripts were obtained by modifying the published reaction 
conditions to include 200µM GTP and 1.5mM m7GpppG (Pharmacia 
Ltd). RNAs were quantitated either by trace-labelling with 
[α-32P]-rUTP or by formaldehyde-agarose gel electrophoresis as 
described (24). 

Preparation and electroporation of tobacco mesophyll protoplasts 

Mesophyll protoplasts were isolated from leaves of 
Nicotiana tabacum (cv. Xanthi) and stored in 0.7M mannitol (25). 
Electroporation of RNA into protoplasts and incubations were 
carried out as previously described (7). 

After incubation, electroporated protoplasts were sedimented, 
resuspended and broken by ten passages through a 26-gauge 
needle in 400µl of 0.25M Tris-HCl, pH 7.4, containing 10mM 
dithiothreitol (DTT). Extracts were microcentrifuged at 10,000 x g 
for 10 min at 4°C. 

Microinjection of Xenopus laevis oocytes 

X. laevis were purchased from Xenopus Ltd., South Nuffield, 






Nucleic Acids Research 



U.K. Two ng of each synthetic uncapped SP6 mRNA were injected 
into the cytoplasm of stage 6 oocytes in batches of 25 using 
standard procedures (26). Oocytes were incubated for 21 hours 
in Modified Barth's Saline, then washed briefly in distilled 
water. Extracts from Xenopus oocytes were prepared by resuspending 
each sample in 0.25M Tris-HCl, pH 7.4, 10mM DTT (20µl/oocyte), 
followed by sonication for 10 sec. Insoluble material 
was removed by microcentrifugation for 15 min and fractions of 
the supernatant representing equivalent numbers of oocytes were 
assayed for CAT activity. 

CAT assay 

The protein concentration of each supernatant from X. 
laevis oocytes or tobacco protoplasts was determined by the 
method of Bradford (27). The CAT assay was essentially as 
described (28), but used 0.25M Tris-HCl, pH 7.4, containing 
10mM DTT and SOmM acetyl-CoA. Quantitation of the thin-layer 
chromatograph was achieved by cutting out the area corresponding 
to each 14C-labelled spot and counting in a toluene-based 
scintillant containing 4% (w/v) PPO and 0.005% (w/v) POPOP. 
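
The quantitation step can be illustrated with a short calculation. The sketch below assumes that counts from the excised acetylated and unacetylated chloramphenicol spots are available as counts per minute; the function name, the optional background correction, and the numbers in the example are hypothetical.

```python
# Sketch: percent conversion of chloramphenicol to its acetylated form,
# computed from scintillation counts of excised TLC spots.
# Function and parameter names are hypothetical.

def percent_conversion(acetylated_cpm: float, unacetylated_cpm: float,
                       background_cpm: float = 0.0) -> float:
    """Return acetylated counts as a percentage of total recovered counts."""
    acetylated = max(acetylated_cpm - background_cpm, 0.0)
    unacetylated = max(unacetylated_cpm - background_cpm, 0.0)
    total = acetylated + unacetylated
    if total == 0:
        raise ValueError("no counts above background")
    return 100.0 * acetylated / total


if __name__ == "__main__":
    # Illustrative counts only, not data from the paper.
    print(percent_conversion(250.0, 750.0))
```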

GUS assay 

GUS activity was measured spectrophotometrically or fluorimetrically 
in 0.5ml assay buffer containing 50mM sodium phosphate, 
pH 7.0, 10mM 2-mercaptoethanol, 0.1% (v/v) Triton X-100 
and either 1mM p-nitrophenyl-β-D-glucuronide (PNPG; for E. coli 
extracts) or 0.5mM 4-methyl-umbelliferyl-β-D-glucuronide (MUG; 
for tobacco protoplast extracts). Assays were carried out at 
37°C and were terminated by addition of 0.4ml 2.5M 2-amino-2-
methyl-1,3-propanediol for E. coli extracts, or 0.5ml 0.2M 
Na2CO3 for protoplast extracts. p-Nitrophenol absorbance was 
measured at 415nm using a Pye Unicam SP1800 Spectrophotometer. 
Fluorescence was measured by excitation at 365nm and emission at 
455nm in a Perkin-Elmer 204 Fluorescence Spectrophotometer. 
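
A specific activity in the units used in the tables (nmoles MUG hydrolysed/min/µg protein) can be obtained from a fluorimetric time course as the slope of product against time, divided by the amount of protein assayed. The following is a minimal sketch, assuming fluorescence readings have already been converted to nmol 4-MU with a standard curve; all names and example values are hypothetical.

```python
# Sketch: GUS specific activity from a fluorimetric time course.
# Assumes readings are already expressed as nmol 4-MU; names are hypothetical.

def slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def specific_activity(minutes, nmol_4mu, protein_ug):
    """Return nmol MUG hydrolysed per minute per microgram of protein."""
    return slope(minutes, nmol_4mu) / protein_ug


if __name__ == "__main__":
    # Illustrative time course only, not data from the paper.
    print(specific_activity([0, 10, 20, 30], [0.0, 5.0, 10.0, 15.0], 2.0))
```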

In situ localization of GUS activity in SDS-polyacrylamide gels 

Samples of protoplast extracts containing equivalent amounts of 
protein were incubated with an equal volume of gel loading 
buffer (29) at room temperature for 15 min, followed by SDS 
polyacrylamide gel electrophoresis (29) in a 12.5% (w/v) gel at 
50V for 16 hours. The gel was rinsed 4 times in 100ml assay 
buffer (without the glucuronide substrate) for a total of 2 
hours, incubated on ice in assay buffer containing 0.5mM MUG for 
30 min, and transferred to a glass plate at 37°C for 30 min. 
The gel was then sprayed with 0.2M Na2CO3 and photographed under 
long-wavelength ultraviolet light using a Wratten 2E filter. 



RESULTS 

Quantitation of the effect of Ω′-U1 and initiation codon context 
on expression of GUS mRNAs in tobacco protoplasts 

A derivative of the TMV leader, Ω′-U1 (Fig. 1), has been shown 
to enhance translation of CAT mRNA in tobacco mesophyll protoplasts, 
and other eukaryotic and prokaryotic systems (7). To 
quantitate the effect of Ω′-U1 in protoplasts more precisely, we 
used the GUS reporter gene (30). A SalI-ended fragment containing 
the GUS gene from pRAJ235 (30) was introduced into the 
SalI site of the pSP64-derived vectors pJII101 and pJII1 (7), 
resulting in pJII120 and pJII119, with or without a 5'-proximal 
Ω′-U1 sequence, respectively. The native SalI GUS fragment had 
19 nucleotides upstream of the AUG start codon (Fig. 1). The 
context of this AUG codon (5'-CCCUUAUGU-3') was, according to 
Kozak (6), inefficient for eukaryotic translation (hereafter 
referred to as "bad context" GUS). To determine whether the 
effect of Ω′-U1 on mRNA expression was influenced by the context 
of the initiation codon, a SalI fragment of a derivative of the 
GUS gene with an initiation codon context (5'-CGACCAUGG-3') 
close to the consensus sequence for optimal eukaryotic translation 
initiation was constructed (in pRAJ275; (8)). This derivative 
(hereafter referred to as "good context" GUS) had only 7 
nucleotides upstream of the AUG (Fig. 1). This SalI fragment 
was introduced into pJII101 and pJII1 as for "bad context" GUS, 
resulting in pJII140 and pJII139, with or without a 5'-proximal 
Ω′-U1 sequence, respectively. 5'-Capped or uncapped mRNAs were 
synthesized in vitro by SP6 RNA polymerase on BglII-linearized 
pJII119, pJII120, pJII139, and pJII140 templates. Eight micrograms 
of each transcript were electroporated into tobacco mesophyll 
protoplasts and incubated for 20 hours at 25°C. Assaying protoplast 
extracts by GUS-activity gel (Fig. 2) revealed that "good 
context" GUS mRNAs (tracks 6-9) were expressed more efficiently 
[Fig. 1 diagram: HindIII (AAGCTT) and SalI (GTCGAC) sites flank each 
leader construct. Leader labels that remain legible: TMV-U1 strain, 
TMV-SPS strain, BMV RNA 3, TYMV RNA, RSV-Pr-B strain. Sequences that 
remain legible: 

TMV-U1 strain 
m7GpppGUAUUUUUACAACAAUUACCAACAACAACAAACAACAAACAACAUUACAAUUACUAUUUACAAUUACAAUGG 

CAT leader 
GUCGACGAGCUUUUCAGGAGCUAAGGAAGCUAAAAUGG 

GUS leader, "bad" context 
GUCGACCGGUCAGUCCCUUAUGU 

GUS leader, "good" context 
GUCGACCAUGG] 

Fig. 1. DNA constructs representing the 5'-untranslated viral 
leaders tested for translational enhancement. The sequence of 
the untranslated portion of each viral RNA up to position +4 (6) 
of the first open reading frame is shown. Each initiation codon 
(AUG) is underlined. The region of each leader sequence used in 
the construction of the corresponding oligodeoxyribonucleotide 
is marked above the RNA sequence by the bold line (the uppermost 
being Ω′-U1). Terminal restriction sites for HindIII and SalI 
were present in each DNA construct. Additional nucleotides 
present between the SalI site and the start codon (underlined) 
of the CAT or GUS reporter gene cassettes are shown below as RNA 
sequences. 

than "bad context" GUS mRNAs (tracks 2-5). In addition, the 
presence of Ω′-U1 on both the "good" and "bad context" GUS mRNAs 
enhanced expression considerably, whether the mRNAs were capped 
or not. Accurate fluorimetric quantitation of the kinetics of 
GUS activity (Fig. 3), and hence of GUS mRNA expression (Table 
1), revealed that the levels of expression of uncapped "good" or 
"bad context" GUS mRNAs were below the limit of detection and 
only became detectable when the transcripts were capped and/or 
when Ω′-U1 was present. In all cases, the presence of Ω′-U1 
Fig. 2. β-Glucuronidase activity in extracts of electroporated 
tobacco mesophyll protoplasts [...]. Samples representing 
equivalent amounts of protein were loaded in each 
track. Both "bad" and "good context" GUS mRNAs were used to 
quantitate the effect of Ω′-U1 or a 5'-cap on expression of the 
enzyme. Electroporated RNAs were: track 1, no RNA (mock); tracks 
2-5, "bad context" mRNAs; tracks 6-9, "good context" mRNAs [...]; 
tracks 5 and 9, 5'-capped-Ω′-U1-GUS mRNA. 

enhanced expression markedly, stimulating the "bad context" GUS 
mRNA approximately 20-fold. Stimulation of "good context" GUS 
by Ω′-U1 was even greater, showing an 80-fold increase with 
the capped form of the transcript (Table 1). 

Other viral leader sequences as translational enhancers in 
tobacco protoplasts 

To determine whether the phenomenon of translational enhancement 
is associated with all viral RNA leader sequences or only 
with those leaders which form disome structures, HindIII- and 
SalI-linkered oligonucleotides were synthesized which represent 
the 5'-leader sequences of: TMV (SPS strain), TYMV, AlMV 
RNA4, BMV RNA3, and part of RSV RNA (Fig. 1). Due to limitations 
of synthesis only the 5'-112 residues of the 380-nucleotide RSV 
[Fig. 3 plot: GUS activity against time (x-axis, minutes).] 

Fig. 3. Kinetic analysis of β-glucuronidase activity in extracts 
from electroporated tobacco mesophyll protoplasts [...] 
fluorimetry of the rate of appearance of the reaction product 
(4-methyl-umbelliferone, 4-MU). Equivalent amounts of protein 
were added to each assay. Curves: "bad context" or 
"good context" GUS mRNAs; 5'-capped "bad" or "good context" 
GUS mRNAs; Ω′-U1-"bad context" GUS mRNA; Ω′-U1-"good 
context" GUS mRNA; 5'-capped-Ω′-U1-"bad context" GUS mRNA; 
5'-capped-Ω′-U1-"good context" GUS mRNA. 

leader (3) were synthesized. This includes the region (residues 
9-53) of the native RSV leader shown to act as the binding site 
for a second 80S ribosome (16). For cloning purposes, these 
oligonucleotides were manipulated in an identical fashion to 
Ω′-U1 (7). A family of SP6-transcripts in which each leader 
was located upstream of the "bad context" GUS gene were electroporated 
into tobacco protoplasts. Only the Ω′-U1 and, to a lesser 
extent, the Ω′-SPS leaders proved stimulatory for uncapped transcripts 
(Table 2). However, when the transcripts were capped, 
stimulation was observed with the leaders of AlMV RNA4, BMV RNA3 
TABLE 1 

Translational enhancement by Ω′-U1 on GUS mRNAs 
electroporated into tobacco protoplasts 

SP6-RNAs       Initiation      Specific activity          Fold- 
               codon context   (nmoles MUG hydrolysed/    stimulation 
                               min/µg protein) 

Uncapped 
GUS            bad             <0.01                      1 
Ω′-U1-GUS      bad             0.18                       >18 
GUS            good            <0.01                      1 
Ω′-U1-GUS      good            0.35                       >35 

5'-Capped 
GUS            bad             0.03                       1 
Ω′-U1-GUS      bad             0.61                       20 
GUS            good            0.04                       1 
Ω′-U1-GUS      good            3.2                        80 


and "RSV" as well as Ω′-U1 and -SPS. Only with the TYMV 
leader did the level of GUS activity remain below the limit of 
detection. 

Other viral leader sequences as translational enhancers in X. 
laevis oocytes 

We have shown (7) that Xenopus oocytes, microinjected with 
capped or uncapped CAT mRNAs, gave approximately 3- to 4-fold 
more CAT activity when the Ω′-U1 leader sequence was present. 
In common with most (or all) animal cells, Xenopus oocytes 
contain high levels of endogenous GUS activity. It was therefore 
not feasible to assay the different viral leader sequences 
using GUS mRNA as the reporter. Consequently, various pSP64-
based leader constructs, each containing the CAT gene, were 
transcribed and the uncapped mRNAs microinjected into oocytes. 
In this experiment, the presence of Ω′-U1 gave a 7.5-fold enhancement 
of CAT activity (Fig. 4). This probably reflects the 
better quality of the oocytes compared with those used previously (7). The Ω′-SPS 
TABLE 2 

Translational enhancement by various viral leaders on "bad 
context" GUS mRNAs electroporated into tobacco protoplasts 

SP6-RNAs         Specific activity          Fold- 
                 (nmoles MUG hydrolysed/    stimulation 
                 min/µg protein) 

Uncapped 
GUS              <0.01                      1 
Ω′-U1-GUS        0.25                       >25 
Ω′-SPS-GUS       0.15 
TYMV-GUS         <0.01 
AlMV RNA4-GUS    0.01 
BMV RNA3-GUS     0.01 
"RSV"-GUS        <0.01 

5'-Capped 
GUS              0.03                       1 
Ω′-U1-GUS        0.54                       18 
Ω′-SPS-GUS       0.43                       14 
TYMV-GUS         0.01 
AlMV RNA4-GUS    0.23                       8 
BMV RNA3-GUS     0.23                       8 
"RSV"-GUS        0.23                       8 



sequence gave a similar (6-fold) level of enhancement. The BMV 
RNA3, "RSV", and AlMV RNA4 leaders were not stimulatory in this 
system. The TYMV leader sequence appeared to reduce expression 
of CAT mRNA. 

Enhancement by viral leader sequences in prokaryotic cells 

In previous work (7), Ω′-U1 was shown to be stimulatory in 
vitro in an E. coli translation system. The reporter gene sequences 
used encoded CAT or neomycin phosphotransferase (NPTII). 
In both cases, the transcripts contained the natural prokaryotic 
[Fig. 4 autoradiograph: tracks 1-9, with the origin, 1-Ac and 3-Ac 
positions marked.] 

Fig. 4. The effect of various viral leader sequences on 
expression of CAT mRNAs microinjected into X. laevis oocytes. 
Oocyte extract volumes (equivalent to 0.25 x cell) were assayed 
in each case. Conversion (%) of 14C-chloramphenicol into its 
mono-acetylated form is shown above each track. Microinjected 
RNAs were: track 1, no RNA (mock); track 2, CAT mRNA; track 3, 
Ω′-U1-CAT mRNA; track 4, Ω′-SPS-CAT mRNA; track 5, TYMV-CAT mRNA; 
track 6, AlMV RNA4-CAT mRNA; track 7, BMV RNA3-CAT mRNA; track 
8, "RSV"-CAT mRNA; track 9, 0.1 unit purified CAT enzyme added 
to an equivalent volume of extract, as in track 1. The dried tlc 
plate was autoradiographed at room temperature for 4 hours 
before excising and counting the relevant 14C-labelled spots. 

Shine-Dalgarno (S-D) ribosome-binding site. The S-D sequence was 
located between the 3'-end of Ω′-U1 and the start of the open 
reading frames for CAT or NPTII. The S-D region is considered to 
be the most critical feature of a prokaryotic mRNA, signalling 
the attachment of a 30S ribosomal subunit to initiate translation 
at a downstream start codon. Nevertheless, with Ω′-U1 
positioned upstream from the natural S-D region of CAT or NPTII, 
there was a significant enhancement of translation in vitro in 
[Fig. 5 diagram: trp promoter sequence with the -35, -10 and operator 
regions marked above and the HindIII, SalI, BamHI and BglII sites 
marked below: 
CTGTTGACAATTAATCATCGAACTAGTTAACTAGTACGAAGCTTGT...] 

Fig. 5. Trp-promoter construct used to assay the effect of viral 
leader sequences on expression of various GUS gene transcripts 
in situ in E. coli. The -35, -10, and operator regions of the 
promoter are listed above the sequence. Restriction sites 
are underlined below. The arrow indicates the site of transcription 
initiation. GUS gene cassettes were introduced at the 
SalI site of pJII168 after the various HindIII/SalI leader 
cartridges (Fig. 1) had first been inserted. 

E. coli. Recently (8), we have shown that Ω′-U1 also stimulates 
translation of eukaryotic mRNAs, which contain no S-D-like 
sequence, in vitro in an E. coli cell-free system. 

To complement these observations, we examined the effect of 
Ω′-U1 on the in vivo expression of a prokaryotic mRNA which 
lacked a S-D region. A derivative of the tryptophan (trp) promoter 
was constructed (Fig. 5) in the plasmid pJII168. Although 
the HindIII site altered the native sequence of the trp operator 
region slightly, this derivative retained the regulation associated 
with the wild-type trp promoter (data not shown). The position 
of the HindIII site resulted in addition of only 4 nucleotides 
upstream of each leader construct, in contrast to the 12 
additional nucleotides present in our in vitro SP6 transcripts. 
The SalI-ended "bad context" GUS gene fragment has the native 
(E. coli) context of the AUG codon and 13 nucleotides upstream 
from the AUG (Fig. 1). In the native GUS gene, the S-D region 
began just upstream of this 13-nucleotide leader, but this has 
now been replaced by a sequence containing the SalI site. When 
the "bad context" GUS gene was introduced downstream of the trp 
promoter and transformed HB101 cells were induced and assayed, a 
low but measurable level of GUS activity was detected (Table 3). 
This is in agreement with the previous observation (31) that the 
presence of a complete S-D sequence can be advantageous but is 
not essential for gene expression. Insertion of Ω′-U1 between the 
trp promoter and the GUS sequence resulted in a 7-fold increase 
TABLE 3 

Translational enhancement by various viral leaders 
on GUS mRNAs in E. coli transformed with 
recombinant trp promoter plasmid 

Leader-GUS       Initiation      Specific activity          Fold- 
construct        codon context   (nmoles PNPG converted/    stimulation 
                                 min/µg protein) 

GUS              bad             24                         1 
Ω′-U1-GUS        bad             162                        7 
GUS              good            3.8                        1 
Ω′-U1-GUS        good            31                         8 
Ω′-SPS-GUS       good            7.4                        2 
TYMV-GUS         good            20                         5 
AlMV RNA4-GUS    good            21                         6 
BMV RNA3-GUS     good            10                         3 
"RSV"-GUS        good            6.4                        2 



in GUS activity, a level in good agreement with that observed 
previously for prokaryotic transcripts which contained a S-D 
region (7,8). This observation contradicts the view that, in all 
cases, E. coli transcripts must have a S-D region for efficient 
expression. The "good context" GUS mRNA has had the initiation 
codon context dramatically altered from that of the native gene, 
and it lacks all the native GUS leader sequence, now replaced 
with the SalI site and one C-residue (Fig. 1). The trp promoter 
construct containing this "good context" GUS resulted in extremely 
low, but detectable, levels of GUS activity. Even in this 
severely altered context, addition of the Ω′-U1 leader produced 
an 8-fold stimulation in expression of GUS mRNA. 

As described above, some residual sequences of the natural 
GUS mRNA leader were present in the "bad context" GUS construct. 
Because these might provide some cryptic S-D function, the "good 
context" GUS construct was chosen as the most sensitive reporter 
to assay for the effect of the other viral RNA leaders on prokaryotic 
translation in vivo. The AlMV RNA4 and TYMV leaders 
produced a 6- and 5-fold stimulation, respectively (Table 3). 
The BMV RNA3 and "RSV" leaders provided only slight enhancement, 
3- and 2-fold, respectively. Surprisingly, the Ω′-SPS leader 
sequence was much less stimulatory than Ω′-U1 in E. coli, causing 
only a 2-fold enhancement. 

DISCUSSION 

Work carried out by Kozak (6) showed that the initiation 
codon context of eukaryotic mRNAs has an important role in 
determining the selection of a particular start site and the 
level of mRNA expression. Our results from protoplasts, using 
two variants of GUS mRNA with either a "good" or "bad" initia- 
tion codon context, support these earlier findings (6). 
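
The distinction between the two GUS constructs can be made concrete with a small classifier. The sketch below is a deliberate simplification, assuming that only two positions matter (a purine at -3 and a G at +4, with the A of AUG numbered +1); the three-level labels and the function name are hypothetical and are not Kozak's own scoring scheme.

```python
# Sketch: classify the context of an AUG start codon using a purine at -3
# and a G at +4 (A of AUG = +1). The labels are a simplification chosen
# for illustration only.

def kozak_context(rna: str, aug_index: int) -> str:
    """Classify the AUG that starts at `aug_index` (0-based) in `rna`."""
    rna = rna.upper().replace("T", "U")
    if rna[aug_index:aug_index + 3] != "AUG":
        raise ValueError("no AUG at the given index")
    if aug_index < 3 or aug_index + 3 >= len(rna):
        raise ValueError("need positions -3 and +4 around the AUG")
    purine_at_minus3 = rna[aug_index - 3] in "AG"
    g_at_plus4 = rna[aug_index + 3] == "G"
    score = int(purine_at_minus3) + int(g_at_plus4)
    return ("weak", "adequate", "strong")[score]


if __name__ == "__main__":
    print(kozak_context("CCCUUAUGU", 5))  # "bad context" GUS
    print(kozak_context("CGACCAUGG", 5))  # "good context" GUS
```

Under these assumptions the "bad context" sequence has a pyrimidine at -3 and U at +4, whereas the "good context" sequence has A at -3 and G at +4.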

The endogenous level of GUS activity in tobacco mesophyll 
protoplasts is extremely low. Thus we were able to quantitate 
accurately the stimulatory effect of Ω′-U1 and the other viral 
leaders on expression of GUS mRNA. Whether using "good" or 
"bad context" GUS mRNA, the presence of Ω′-U1 at the 5'-end 
resulted in a substantial enhancement of expression (approximately 
20-fold; Table 1). When capped mRNAs were used (Table 1), 
the final level of enhancement by Ω′-U1 with "bad context" GUS 
mRNA was greater than 60-fold and, with "good context" GUS 
mRNA, greater than 320-fold over that seen with the respective 
GUS mRNAs lacking both a cap and an Ω′-U1 sequence. Similar 
levels of enhancement were observed with Ω′-SPS (Table 2). In 
contrast, none of the other viral leader sequences were stimulatory 
with uncapped GUS mRNAs (Table 2). However, with capped 
GUS mRNAs, the leader sequences of AlMV RNA4, BMV RNA3, and 
"RSV" gave an 8-fold enhancement (Table 2). Only the TYMV leader 
failed to enhance, irrespective of whether the GUS mRNA was 
capped or not. It is of interest to note that the TYMV leader 
has been shown to form disomes (14), suggesting that the ability 
of a leader sequence to form disomes does not correlate with its 
ability to enhance translation. Alternatively, the 12 additional 
5'-nucleotides added by our SP6 vector construct (7) may have 
selectively destroyed the ability of the TYMV leader to enhance 
translation. However, it appears that even in the absence of 
these additional 5'-nucleotides, the TYMV leader sequence fails 
to stimulate translation (L. Gehrke, personal communication). In 
vivo, it may be that the TYMV leader is extremely host-dependent 
in its enhancing ability. Therefore even protoplasts made from 
tobacco mesophyll cells do not provide the proper machinery for 
the TYMV sequence. Certainly the ability of a viral leader 
sequence to enhance translation is not strictly dependent on its 
capacity to bind more than one ribosome, as shown by data (above 
and (18)) with the leader of AlMV RNA4 (a monosome former). 
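
The fold-stimulation figures quoted above follow from simple ratios of the specific activities in Table 1. The sketch below reproduces that arithmetic, assuming that an activity reported as below the limit of detection is replaced by the limit itself (0.01), so that the resulting ratio is a lower bound; the function name and the use of `None` for "below detection" are hypothetical.

```python
# Sketch: fold-stimulation relative to a reference activity. A reference
# below the detection limit is passed as None and replaced by the limit,
# which makes the returned ratio a lower bound.

DETECTION_LIMIT = 0.01  # nmoles MUG hydrolysed/min/ug protein


def fold_stimulation(activity, reference=None, limit=DETECTION_LIMIT):
    """Return (fold, is_lower_bound) for `activity` over `reference`."""
    if reference is None:
        return activity / limit, True
    return activity / reference, False


if __name__ == "__main__":
    # Specific activities taken from Table 1.
    cases = [
        ("capped leader-GUS (bad) vs capped GUS", 0.61, 0.03),
        ("capped leader-GUS (good) vs capped GUS", 3.2, 0.04),
        ("capped leader-GUS (bad) vs uncapped GUS", 0.61, None),
        ("capped leader-GUS (good) vs uncapped GUS", 3.2, None),
    ]
    for label, activity, reference in cases:
        fold, lower_bound = fold_stimulation(activity, reference)
        prefix = ">" if lower_bound else ""
        print(f"{label}: {prefix}{round(fold)}-fold")
```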

Translational enhancement of CAT mRNA by Ω′-U1 in microinjected 
oocytes was shown previously (7). In this report, enhancement 
was also observed with the related Ω′-SPS sequence. In 
contrast, leader sequences from AlMV RNA4, BMV RNA3 and "RSV" 
failed to enhance translation of CAT mRNA in oocytes. The TYMV 
leader construct reduced CAT mRNA expression by 80%. 

The enhancing effect of Ω′-U1 in E. coli cells may be due to 
some fortuitous interaction with the prokaryotic translation 
machinery. However, as Ω′-U1 is devoid of G-residues, it cannot 
provide a sequence similar to that described by Shine and 
Dalgarno (5'-AGGAGGU-3'; (1)) and shown to be present in, and 
required for efficient expression of, nearly all E. coli mRNAs 
studied to date. Of the other viral leaders, only those from 
TYMV and AlMV RNA4 displayed any significant enhancement of GUS 
activity in transformed E. coli cells. 

In this survey of viral RNA leader sequences, only one, 
TYMV, consistently failed to enhance expression in the plant 
protoplast system. All the leader sequences were derived from 
positive-sense RNA viruses which must express their genetic 
information immediately and efficiently within the infected 
plant or animal cell to avoid the risk of degradation by host 
RNases. Furthermore, they must compete effectively with the 
endogenous cellular mRNAs. 

Sequence comparisons between the leaders tested here show no 
significant homologies other than a high A,U-content, a common 
feature of viral leader sequences. It is tempting to speculate 
that the viral leader sequence may circumvent the need for some 
rate-limiting initiation factor(s), or that it acts as an 
enhancing element for the association of ribosomes or initiation 
factor(s). A precedent for the former possibility exists in the 
translational regulation of the prokaryotic IF3 gene (32). The 
sequence immediately surrounding the IF3 start codon allows 30S 
ribosomal subunits to bind, and translation to begin, without a 
requirement for IF3, an initiation factor normally essential for 
the initiation process. The findings of Yokoe and co-workers 
(13) suggest, at least for the two TMV leaders tested, that a 
eukaryotic equivalent of the S-D region exists to interact with 
the 3'-end of 18S rRNA. The lack of homology between the various 
viral leader sequences may indicate that no one strategy is 
followed by all, but that there may be several ways to achieve 
enhancement. 
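
The two sequence properties discussed here, a high A,U-content and the absence of G-residues (and hence of any Shine-Dalgarno-like motif), are easy to check computationally. The sketch below uses the TMV U1 leader as it is legible in Fig. 1, with the cap-proximal G and the AUG omitted; that transcription, the four-base "GGAG" core, and the function names are assumptions made for illustration.

```python
# Sketch: base composition of a leader and a search for a Shine-Dalgarno-like
# core. The sequence is the TMV U1 leader as legible in Fig. 1 (cap-proximal
# G and AUG omitted); treat it as illustrative.

TMV_U1_LEADER = (
    "UAUUUUUACAACAAUUACCAACAACAACAAACAACAAACAACAUUACAAUUACUAUUUACAAUUACA"
)
SHINE_DALGARNO = "AGGAGGU"


def au_fraction(rna: str) -> float:
    """Fraction of residues that are A or U."""
    rna = rna.upper()
    if not rna:
        raise ValueError("empty sequence")
    return sum(base in "AU" for base in rna) / len(rna)


def has_sd_like(rna: str, core: str = "GGAG") -> bool:
    """True if the sequence contains the given Shine-Dalgarno core."""
    return core in rna.upper()


if __name__ == "__main__":
    print(f"A+U fraction: {au_fraction(TMV_U1_LEADER):.2f}")
    print("G residues:", TMV_U1_LEADER.count("G"))
    print("S-D-like core present:", has_sd_like(TMV_U1_LEADER))
```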

The high A,U-content of these leaders might suggest a low 
index of secondary structure, which would present fewer 
obstacles to scanning (4,6) by eukaryotic ribosomes. However, 
the weak but stable secondary structure potential of the BMV 
RNA3 leader (2), and the potential of the complete RSV leader to 
form extensive secondary structures (3), have been used to 
explain how the 5'-cap and the initiation codon are juxtaposed 
to facilitate ribosome binding and translation initiation. 
The affinity of these sequences for translation initiation 
factors or other mRNA-binding proteins remains to be tested. 
Clearly substantial additional work is required to elucidate the 
mechanism(s) whereby these viral leader sequences can enhance 
expression of contiguous coding regions in such diverse 
translation systems. 

We have determined the complete primary structure (13 637 
bp) of the TL-region of Agrobacterium tumefaciens octopine 
plasmid pTiAch5. This sequence comprises two small direct 
repeats which flank the TL-region at each extremity and are 
involved in the transfer and/or integration of this DNA segment 
in plants. TL-DNA specifies eight open-reading frames 
corresponding to experimentally identified transcripts in 
crown gall tumor tissue. The eight coding regions are not interrupted 
by intervening sequences and are separated from 
each other by AT-rich regions. Potential transcriptional control 
signals upstream of the 5' and 3' ends of all the 
transcribed regions resemble typical eukaryotic signals: 
(i) transcriptional initiation signals ('TATA' or Goldberg-
Hogness box) are present upstream to the presumed translational 
start codons; (ii) 'CCAAT' sequences are present 
upstream of the proposed 'TATA' box; (iii) polyadenylation 
signals are present in the 3'-untranslated regions. Furthermore, 
no Shine-Dalgarno sequences are present upstream of 
the presumed translational start codons. 
Key words: Agrobacterium tumefaciens/T-DNA/nucleotide 
sequence 



Introduction 

One of the remarkable properties of the Ti plasmids of 
Agrobacterium is their natural capacity to transfer, insert, 
and express a particular DNA segment of the Ti plasmid in 
plant cells (for recent reviews, see Nester and Kosuge, 1981; 
Bevan and Chilton, 1982; Caplan et al., 1983; Zambryski et 
al., 1983). Depending on the host plant and on the nature of 
Ti plasmid present in the inciting Agrobacterium strain, the 
transformation event results in crown gall or hairy-root or 
woolly-knot disease (see Kahl and Schell, 1982). 

The segment of Ti plasmid DNA which becomes stably inserted 
in the plant genome is called T-DNA (Chilton et al., 
1977; Lemmers et al., 1980; Thomashow et al., 1980). On the 
Ti plasmid this DNA segment is bordered by two direct-
repeat sequences of 25 bp (Zambryski et al., 1982, 1983; 
Yadav et al., 1982; Holsters et al., 1983). In the case of the 
octopine Ti plasmids, two regions of the Ti plasmid, called 
TL (T-left) and TR (T-right) (Thomashow et al., 1980) according 
to their position on the standard octopine Ti plasmid 
map (De Vos et al., 1981), can be transferred and inserted independently 
into the plant genome. The TL-DNA has been 
studied more extensively because it encodes essential functions 
involved in the neoplastic transformation of plant cells 
(De Beuckeleer et al., 1981; Garfinkel et al., 1981; Leemans et 
al., 1982; Willmitzer et al., 1982). The TL-DNA also comprises 
the functions found in common between octopine-type 
and nopaline-type Ti plasmids' T-regions (Depicker et al., 
1978; Chilton et al., 1978; Engler et al., 1981; Willmitzer et 
al., 1983). 

Recently, the nucleotide sequences of the octopine synthase 
gene (De Greve et al., 1982a), of the gene for 'transcript 7' 
(Dhaese et al., 1983), and of the gene for 'transcript 4' 
(Heidekamp et al., 1983) were determined. Here we present 
the complete nucleotide sequence of the TL-DNA of the 
Agrobacterium tumefaciens plasmid pTiAch5. 

Results and Discussion 

Sequence determination 

To determine the complete sequence of the octopine TL-
region, different plasmids containing subfragments of the 
TL-DNA were constructed (Table I) from clones pGV0153 
and pGV0201 (De Vos et al., 1981) containing fragments 
BamHI-8 and HindIII-1 (Figure 1), which overlap the complete 
TL-DNA region. Detailed physical maps of these subclones 
were established to facilitate the nucleotide sequencing. 
Plasmid DNA was cleaved with a particular restriction enzyme, 
and the resulting fragments were 32P end-labeled either 
at their 5' termini with polynucleotide kinase or at their 3' 
termini with the Klenow fragment of DNA polymerase I. 
After strand separation or secondary restriction to separate 
the labeled extremities, the sequence was determined by the 
limited chemical cleavage method of Maxam and Gilbert 
(1980). Both DNA strands were sequenced to avoid mistakes 
that could occur in regions with a distinct secondary structure 
or by incorrect reading and processing of the sequence information. 
In addition, as methylated bases (Ohmori et al., 
1978) can interfere with correct reading of the sequence, all 
EcoRII sites located in the TL-region were used for sequencing. 
Furthermore, care was taken that all restriction sites used 
to generate fragments were resequenced by using another 
fragment containing an alternative site. Figure 2 gives an 
overview of the sequence strategy. 
Sequence analysis 

An uninterrupted sequence of 13 637 bp including the whole
TL-DNA of pTiAch5 was determined, and is displayed in the
conventional orientation in Figure 3. The numbering starts at
the HindIII site bordering fragments 14 and 18c, which is
located 308 bp to the left of the left TL-DNA terminus
sequence.

Termini sequences. The TL-region is flanked at both extremities
(position 308 and 13 459) by direct repeats of 24 bases, which
are believed to be important for the transfer of the TL-DNA
segment (Zambryski et al., 1982; Simpson et al., 1982; Holsters
et al., 1983).
Table I. Bacterial strains and plasmids

            Antibiotic
            resistance  Characteristics                             Origin

Bacterial strains
K514                    thr leu thi hsdR                            Colson et al. (1965)
SK383       Sm          F- Arg- his4, Ilv- lacMS286                 S. Kurshner
                        φ80dIIlacBK1 Sup- dam4

Plasmids
pGV0153     Ap          BamHI-8 of pTiAch5 in pBR322                De Vos et al. (1981)
pGV[...]    Ap Cml      HindIII-18c of pTiAch5 in pBR325            Dhaese et al. (1983)
pGV714      Ap Cml      HindIII-22c of pTiAch5 in pBR325            This work
pGV715      Ap Cml      HindIII-36 of pTiAch5 in pBR325             This work
pGV716      Ap Cml      HindIII-BamHI fragment overlapping the      This work
                        fragments BamHI-8 and HindIII-1 in pBR325
pGV0201     Ap          HindIII-1 of pTiAch5 in pBR325              De Vos et al. (1981)
pGV105      Ap Tc       EcoRI-19a of pTiAch5 in pBR325              De Greve et al. (1982a)
pGV99       Ap Clm      BamHI-17a of pTiAch5 in pBR325              De Greve et al. (1982a)
pGV101      Ap Clm      BamHI-[...]a of pTiAch5 in pBR325           This work
pGV100      Ap Clm      BamHI-28 of pTiAch5 in pBR325               This work
pGV732      Ap Clm      AvaI deletion of pGV101                     This work
pGV733      Ap Clm      BclI deletion of pGV732                     This work
pGV734      Ap Clm      BclI deletion of pGV0201                    This work


Fig. 1. Restriction map of the TL-DNA of the octopine Ti plasmid pTiAch5. Upper portion: the positions of the open-reading frames are presented by open boxes [...] Willmitzer et al. (1982). The polarity of the open-reading frames is indicated as follows: open boxes above the line are transcribed from left to right, boxes below the line are transcribed from right to left. The extent of the TL-DNA is indicated by an arrow [...] (heavy bars). Lower portion: a restriction map of the TL-DNA region is shown for the restriction enzymes BamHI, HindIII, and EcoRI.



A computer search of the complete TL-region for DNA sequences
displaying homologies with these direct repeats revealed 10
related DNA sequences. These sequences are listed in Table II.
Genetic and physical data indicate that some of these sequences
might also be used in vivo during transfer and integration of
the TL-DNA. Firstly, the sequence (position 11 798) present in
the 3'-untranslated region of the octopine synthase gene has
been noted by Holsters et al. (1983). If this sequence is
recognized as a left terminus sequence, the presence of the
abbreviated T-DNA found in the octopine-positive regenerated
plants rGV1 and rGV5 (De Greve et al., 1982b) can be explained.
Alternatively, if this sequence is recognized as a right
terminus sequence, instead of the normal terminus sequence,
tumor lines containing a shorter TL-DNA which do not synthesize
octopine (Thomashow et al., 1980; De Beuckeleer et al., 1981;
Ooms et al., 1982) are formed. The origin of teratomas
(unpublished results) expressing transcripts 4, 6a, 6b, octopine
synthase, and possibly transcript 1, can be explained if the
sequence (position 3750) located in transcript 2 is used as a
left terminus sequence. Similarly, an abnormal plant
(unpublished data) possibly containing transcript 4 and
expressing the octopine synthase gene could be explained if the
sequence (position 7777) is used as a left terminus sequence. In
addition, any of the sequences at position 9078, 10 131, or
10 603, if used as a right terminus sequence, could explain the
short TL-DNA observed in a Petunia tumor line P-Ach5 (De
Beuckeleer et al., 1981). Whether the other sequences also
signalled the creation of abbreviated TL-DNAs is difficult to
answer because in most cases the resulting transferred DNA
Nucleotide sequence of the TL-DNA of plasmid pTiAch5

Fig. 2. Sequencing strategy. The extent of each sequencing experiment is indicated by an arrow [...] indicated by a heavy bar, and the open-reading frames corresponding to plant transcripts by open boxes. The polarity of the open-reading frames is indicated from left to right by drawing the open boxes above the line and from right to left by drawing the open boxes below the line.
[Figure 3(i)-(iii): nucleotide sequence of the TL-region, positions 1 to 13 637, with the amino acid translations of the coding sequences.]

Fig. 3. Complete nucleotide sequence of the TL-region of pTiAch5. An uninterrupted sequence of 13 637 bp starting at the HindIII site bordering the
fragments 14 and 18c covers the whole TL-region. The sequence is displayed in the conventional orientation along with the translation in amino acids for the
coding sequences for which experimental evidence exists. The amino acid sequence is above the DNA sequence when transcription occurs from left to right,
and below the sequence for the other orientation. The two direct repeats present at both extremities of the TL-DNA are indicated by a closed box. The
mRNA start and the polyadenylation sites and signals of transcripts 3, 4 and 7 are indicated by an arrow. The polyadenylation signals of transcripts 3 and 7
are underlined and their polyadenylation sites are indicated by an asterisk.
would not produce an easily detected altered phenotype in the
transformed plant cells.
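
The kind of search that produced Table II can be illustrated with a short sketch. It assumes a plain ungapped comparison of every 24-base window against the two terminus sequences, which is a stand-in for, not a reproduction of, the Schroeder and Blattner (1982) program; the function names and the toy test sequence are hypothetical.

```python
# Minimal sketch: ungapped sliding-window comparison against the 24-bp
# terminus sequences. This is an assumed simplification, not the
# comparison program cited in the text.

LEFT_TERMINUS = "GGCAGGATATATTCAATTGTAAAT"    # position 308 (Table II)
RIGHT_TERMINUS = "GGCAGGATATATACCGTTGTAATT"   # position 13 459 (Table II)


def identity(a, b):
    """Fraction of positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def find_related(sequence, queries, threshold=0.5):
    """Return (1-based position, window, best identity) for every window
    whose identity to at least one query exceeds the threshold."""
    sequence = sequence.upper()
    width = len(queries[0])
    hits = []
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        best = max(identity(window, q) for q in queries)
        if best > threshold:
            hits.append((i + 1, window, best))
    return hits


# Toy example (hypothetical sequence): the left terminus embedded in filler.
toy = "ACGT" * 3 + LEFT_TERMINUS + "ACGT" * 3
hits = find_related(toy, [LEFT_TERMINUS, RIGHT_TERMINUS])
```

With the >50% criterion of Table II, the sequence listed at position 11 798 matches the left terminus at 14 of 24 positions, so it passes the threshold; overlapping windows around a true hit can also pass, so in practice the hits would be inspected and pruned by hand.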

Size and position of coding sequences. The sequence between
the 24-bp direct repeats was analyzed for possible translational
open-reading frames. The 18 largest open-reading frames are
presented in Table III. To evaluate which of these open-reading
frames are actually used in vivo, their position was compared
with the known positions of TL-DNA transcripts in octopine crown
gall tissues (Willmitzer et al., 1982). Seven

Table II. DNA sequences homologous to the 24-bp termini sequences

Left terminus sequence     GGCAGGATATATTCAATTGTAAAT       308 bp
                           ACCAATTTTTTTTCAATTCAAAAA       407 bp
                           CAGAGTTTATATTCAAAAATCAGT      1024 bp
                           CCCAACAGATATACCCTTTGATAT      1293 bp
                           CCTTTGATATACTCAATGTATCTT      1307 bp
                           CATCTAATCTATTCAGTTTGAAGT      3750 bp
                           GGGACAATTAGGTCAATTGTAATA      7777 bp
                           TATAATGTGGCTATAATTGTAAAA      9078 bp
                           TAAATGTTATATTTAATTCTTCTT    10 131 bp
                           CCGGGCATAAAAACCGTAGTTTTC    10 603 bp
                           CGGGTGATATATTCATTAGAATGA    11 798 bp
Right terminus sequence    GGCAGGATATATACCGTTGTAATT    13 459 bp

The TL-region sequence was compared with the left and the right terminus
sequences using the comparison program written by Schroeder and Blattner
(1982). All sequences sharing >50% homology with the terminus
sequences were maintained.




of the open-reading frames did correspond with known
transcripts. We tested whether or not some of the other
open-reading frames might correspond to TL-DNA regions whose
transcripts might have gone undetected, by comparing their
position with empty regions in the transcription map. This was
the case only for open-reading frame m (Table III).
Subsequently, a careful experimental analysis confirmed that
this open-reading frame corresponded to an actual transcript
(6b) (Willmitzer et al., 1983; Joos et al., 1983). The
translation of these eight open-reading frames in amino acids is
presented in Figure 3 and their codon usage is listed in Table
IV. It was also tested whether open-reading frame p, which is
derived from the opposite strand of transcript 3 and which might
code for a protein of 142 amino acids, could correspond to an
actual transcript. M13 mp2 phage DNAs, containing the small
EcoRI fragments E1 and E2 (Figure 1) located in the octopine
synthase gene, were separately applied on nitrocellulose and
hybridized with labeled mRNA isolated from tobacco crown gall
tissues. Only the phage DNA spot containing the strand
corresponding to transcript 3 (octopine synthase) hybridized
with mRNA (data not shown).
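
The open-reading-frame survey described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure actually used: it reports frames from the first in-frame ATG to the stop codon on both strands, uses a 75-codon cut-off chosen to match the smallest frame listed in Table III, and all function names are hypothetical. Coordinates are 1-based on the displayed strand, so frames read from right to left have a start coordinate larger than their end coordinate, as in Table III.

```python
# Minimal sketch of a six-frame open-reading-frame scan (assumed
# simplification; names and the cut-off are illustrative choices).

STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq, min_aa):
    """Yield (atg_index, end_index, n_aa) for each frame of one strand.
    atg_index is the 0-based index of the first in-frame ATG, end_index
    is one past the last base of the stop codon."""
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif codon in STOPS:
                if start is not None:
                    n_aa = (i - start) // 3   # codons before the stop
                    if n_aa >= min_aa:
                        yield start, i + 3, n_aa
                start = None


def find_orfs(seq, min_aa=75):
    """Return (strand, first base of ATG, last base of stop, n_aa)."""
    seq = seq.upper()
    n = len(seq)
    found = []
    for s, e, aa in orfs_in_strand(seq, min_aa):
        found.append(("+", s + 1, e, aa))
    for s, e, aa in orfs_in_strand(reverse_complement(seq), min_aa):
        # Map indices on the reverse strand back to displayed coordinates.
        found.append(("-", n - s, n - e + 1, aa))
    return found


# Toy example (hypothetical sequence): one 81-codon frame.
toy = "CC" + "ATG" + "GCT" * 80 + "TAA" + "GG"
forward = find_orfs(toy)
backward = find_orfs(reverse_complement(toy))
```

Whether a frame found this way is actually expressed still has to be decided by comparison with the transcription map, as done in the text.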

We have applied the RNY algorithm described by 
Shepherd (1981) on the whole sequence of the TL-DNA (data 
not shown). Eight frames were detected and these correspond 
to the eight known transcribed regions. 
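
The idea behind an RNY-based frame search can be sketched as follows. This is a simplified illustration of the principle (coding regions tend to be enriched in purine-any-pyrimidine triplets in the true reading frame), assuming a plain count per frame over a given stretch; it is not a reimplementation of the algorithm of Shepherd (1981), and the function names are hypothetical.

```python
# Simplified sketch of RNY counting: R = purine (A or G), N = any base,
# Y = pyrimidine (C or T). Assumed illustration only.

PURINES = set("AG")
PYRIMIDINES = set("CT")


def rny_counts(seq):
    """Count triplets of the form RNY in each of the three frames."""
    seq = seq.upper()
    counts = [0, 0, 0]
    for frame in range(3):
        for i in range(frame, len(seq) - 2, 3):
            if seq[i] in PURINES and seq[i + 2] in PYRIMIDINES:
                counts[frame] += 1
    return counts


def best_frame(seq):
    """Return the frame (0, 1 or 2) with the highest RNY count."""
    counts = rny_counts(seq)
    return counts.index(max(counts))
```

Applied to windows moved along the sequence, a frame that stays clearly ahead of the other two over a long stretch is a candidate coding region.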

The size and map position of several proteins, expressed by
the T-DNA in transformed plant cells, or by the T-region in
bacterial cell-free systems, have been recently determined
(summarized in Table III). By hybridization selection and
translation of T-DNA-encoded mRNA from octopine tumors, three
proteins of 39, 27 and 14 kd were detected (Schröder and
Schröder, 1982). The largest has been shown to



Table III. Co-ordinates of open-reading frames of the TL-region DNA

Open       Nucleotide        First ATG    Σ AA    Mol. wt.                  Correspondence
region   First     Last      in frame             Calculated   Observed
                                                  (d)          (kd)

a         1054     1740        1060       226      25 635                   Transcript 5
b         1569     1135        1512       125      14 310
c         2726     2307        2687       126      14 219       14          Transcript 7
d         4124     4474        4232        80        8252
e         4881     3460        4863       467      49 655       49          Transcript 2
f         5155     7476        5209       755      83 815       74          Transcript 1
g         6039     5659        5979       106      12 101
h         6888     6622        6876        84      10 014
i         7025     7513        7178       111      12 750
j         8105     8893        8171       240      26 873       27          Transcript 4
k         8542     8294        8527        77        8858
l         9344     9970        9395       191      21 335                   Transcript 6a
m       11 160   10 453      11 076       207      23 320                   Transcript 6b
n       11 142   11 405      11 178        75        8160
o       11 581   11 092      11 353        86        9375
p       12 020   12 460      12 032       142      16 455
q       13 081   11 954      13 030       358      38 665       39          Transcript 3
r       13 203   12 901      13 203       100      11 331

[...] open-reading frames larger than 75 amino acids. The co-ordinates are those of the first nucleotide following the preceding stop [...]. The mol. wt.
has been calculated and is compared, when possible, with experimental data (Schröder and Schröder, 1982; Schröder et al., 1981, 1983).
Table IV. Codon usage

                    Transcripts
              5    7    2    1    4   6a   6b    3

Phe   UUU     3    5   11   21    2    4    5    8
      UUC     5    3    6   22    7    3    4    8
Leu   UUA     1    2    9    4    1    3    2    1
      UUG     5    3    7   13    6    4    4    7
      CUU     4    0    7   11    9    6    5   11
      CUC     5    3    6   16    2    3    1    9
      CUA     2    1    8    5    3    3    1    4
      CUG     1    5   15   22    6    3    5    4
Ile   AUU     5    2   14   18    8    2    4    8
      AUC     5    2    9   18    7    5    6    9
      AUA     7    2   11    9    1    1    2    5
Met   AUG     5    3    5   17    8    5    7    5
Val   GUU     9    1    9   14    3    4    3    8
      GUC     4    1    2   13    2    2    2    6
      GUA     1    2   11    3    0    1    3    3
      GUG     3    0    8   18    3    3    0   12

Ser   UCU     4    1    3   13    1    3    2    6
      UCC     4    2    6   11    1    1    5    5
      UCA     1    3    5    9    1    5    1    5
      UCG     4    1    3    6    2    2    0    4
Pro   CCU     1    0    7   13    3    1    1    3
      CCC     4    2    8    3    4    1    0    3
      CCA     7    4   11   10    3    3    3    7
      CCG     2    0    8   12    1    1    4    4
Thr   ACU     2    4    4    6    1    2    3    6
      ACC     1    1    8    8    4    1    2    5
      ACA     5    3    9   14    3    2    2    2
      ACG     0    0    4    3    5    1    3    7
Ala   GCU     9    1   13   19    6    7    3   10
      GCC     1    3   19   14    7    4    2    5
      GCA     3    3   13   18    6    1    5   14
      GCG     3    1    8   10    4    1    5   10

Tyr   UAU     6    6    8   12    7    7    4    5
      UAC     2    1    4   11    1    3    6    4
Stop  UAA     1    1    1    0    0    0    1    0
      UAG     0    0    0    1    1    1    0    0
His   CAU     2    3    3   13    8    1    1    2
      CAC     1    1    6    4    2    1    1    3
Gln   CAA     7    5    5   13    7    8    4    6
      CAG     5    1    3    9    9    3    6    8
Asn   AAU     9    4    8   13    4    4    7    8
      AAC     2    2   12   12    5    3    8   12
Lys   AAA     8    4   12   17    4    6    0    6
      AAG     6    4    4   17    6    3    1    4
Asp   GAU     7    2   16   23    6    9    9    6
      GAC     8    3   12   23    4    4    5    6
Glu   GAA     7    4   13   23    6    7    9   10
      GAG     1    5    3   15    9    5    8   15

Cys   UGU     2    0    3    8    1    3    1    0
      UGC     3    3    4   13    3    0    4    4
Stop  UGA     0    0    0    0    0    0    0    1
Trp   UGG     5    1    2   14    3    2    2    4
Arg   CGU     1    0    3    6    ?    0    2    2
      CGC     4    1    7    6    ?    1    4    3
      CGA     2    0    6    7    3    1    4    0
      CGG     3    1    5    8    3    7    2    3
Ser   AGU     3    1    0    8    1    0    1    2
      AGC     3    2   12    6    3    6    2    7
Arg   AGA     1    1    7    5    1    1    3    3
      AGG     4    0    1   13    2    2    1    6
Gly   GGU     1    2    9   19    4    7    4    6
      GGC     5    2   13   14    2    5    3    8
      GGA     2    2   13   16    9    3    7    6
      GGG     0    1    6   14    3    1    3    5

There is no general bias in the codon usage of these eight coding sequences taken together, although individually, large deviations do occur. We should note
that the transcripts 1, 2, 3, 6a and 6b have a high preference for G as first base (>33.9%) and transcripts 4, 6a, 6b and 7 have a high percentage of A in the
second position (>33.2%). No such deviations are noted in the third position.
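
A tabulation of the kind shown in Table IV, together with the per-position base percentages quoted in the note, could be produced along the following lines. This is a minimal sketch; the function names are hypothetical, the stop codon is counted with the other codons (as the column totals of Table IV suggest), and the toy coding sequence is invented for illustration.

```python
# Minimal sketch: codon counts and base composition per codon position.
from collections import Counter


def codon_usage(cds):
    """Count codons (written in RNA letters) in a coding sequence that
    starts at the ATG and ends with the stop codon."""
    cds = cds.upper().replace("T", "U")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return Counter(cds[i:i + 3] for i in range(0, len(cds), 3))


def position_composition(usage):
    """Return three dicts giving the percentage of each base at the
    first, second and third codon position."""
    total = sum(usage.values())
    per_position = [Counter(), Counter(), Counter()]
    for codon, n in usage.items():
        for pos, base in enumerate(codon):
            per_position[pos][base] += n
    return [
        {base: 100.0 * n / total for base, n in counts.items()}
        for counts in per_position
    ]


# Toy example (hypothetical sequence): Met-Ala-Ala-stop.
usage = codon_usage("ATGGCTGCTTAA")
composition = position_composition(usage)
```

For the toy sequence, G occupies the first position in two of four codons, so `composition[0]["G"]` is 50.0; applied to a real coding sequence, the same figure corresponds to the "G as first base" percentage discussed in the note to Table IV.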



be octopine synthase (transcript 3). The smallest one was
selected with HindIII fragment 18 (Figure 1) and corresponds
to the translated part of the gene transcript 7. The nucleotide
sequences of both transcripts 3 and 7 have been described (De
Greve et al., 1982a; Dhaese et al., 1983). The third protein
(mol. wt. = 27 kd) was observed after hybridization selection
with both of the partially overlapping fragments BamHI-8 and
HindIII-1 (Schröder and Schröder, 1982) (Figure 1). The authors
suggested that at least part of the coding region is common to
both fragments, but we do not find any open-reading frame in
this part of the TL-region corresponding to a protein of this
size. However, from Table III it appears that the polypeptides
encoded by transcript 4 (located in HindIII fragment 1; Figure
1) and transcript 5 (located in BamHI fragment 8; Figure 1)
have nearly the same mol. wts. (26 873 and 25 635 daltons,
respectively). The experimental results obtained by Schröder
and Schröder (1982) can be explained if we assume that the
observed 27-kd protein bands are in fact different and are
encoded by transcripts 4 and 5, respectively.

The TL-regipn of octopine Ti plasmids expresses four pro- 
teins (mol. wt. = 74, 49, 28 and 27 kd) in Escherichia coli 
mini-cells (Schroder et al., 1983). A comparison of the 
regions ^pressed in bacteria and the TL-regiori sequence in- 
dicates that three protein-coding regions in the bacteria cor- 
respond to three open-reading frames which are transcribed in 
plants (Table III). The mol. wts. of the polypeptides encoded 
by transcripts 2 (49 kd) and 4 (27 kd) as calculated from the 
sequence, are in good agreement with the mol. wts. ex- 
perimentally observed by Schroder et al. (1983) in a bacterial 
background. However, there is a discrepancy between the 
calculated (84 kd) and the observed (74 kd) mol. wts. for the 
protein encoded by transcript 1. Schroder et al. (1983) show- 
ed that the right-end of the BamHI-8 fragment (Figure 1) in 
pGV0153 encoded a 66-kd protein, which represents a 
shortened form of the 74-kd protein. The mol. wt. of this 
shortened protein calculated from the DNA sequence is 
69 kd. Furthermore, deletion of fragment HpaI-14, which is 
an internal fragment of EcoRI fragment 7 (Figure 1) that 
covers this region, produced a protein of mol. wt. = 53 kd 
(Schroder et al., 1983). From the DNA sequence we can 
predict that the first 483 amino acids of transcript 1 will be 
fused to the last 16 amino acids of transcript 4 in this deletion 
mutant. The mol. wt. of this fusion protein is 55 kd, in good 
agreement with the mol. wt. (53 kd) observed by Schroder et 
al. (1983). It is likely, therefore, that the 74-kd protein is in- 
deed encoded by the transcript 1 gene and that the difference 
in the observed and calculated mol. wts. can be explained by 
(i) an underestimation of the observed mol. wt. in SDS-poly- 
acrylamide gels, or (ii) proteolytic degradation of this 
polypeptide in bacteria yielding a shorter protein. 
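
The fusion-protein estimate above can be checked with a back-of-the-envelope calculation. The sketch below is only an illustration: it assumes a rule-of-thumb average residue mass of 110 daltons, which is not a value taken from the paper, and the function name is hypothetical. An exact figure would require summing the masses of the actual residues.

```python
# Rough molecular-weight estimate from residue count.
# ASSUMPTION: 110 Da is a common rule-of-thumb average residue mass,
# not a value used by the authors.
AVG_RESIDUE_DA = 110.0


def estimate_kd(n_residues: int, avg_residue_da: float = AVG_RESIDUE_DA) -> float:
    """Return an approximate molecular weight in kilodaltons."""
    return n_residues * avg_residue_da / 1000.0


# Deletion mutant described in the text: the first 483 residues of
# transcript 1 fused to the last 16 residues of transcript 4.
fusion_residues = 483 + 16
fusion_kd = estimate_kd(fusion_residues)
print(f"{fusion_residues} residues -> about {fusion_kd:.1f} kd")
```

With these assumptions the estimate falls close to the 55 kd computed from the sequence and to the 53 kd observed on gels.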

Finally, Schroder et al. (1983) observed a 28-kd polypep- 
tide in E. coli mini-cells. They located the gene encoding this 
polypeptide to the left of transcript 4. We do not find an open- 
reading frame in this region large enough to accommodate 
this 28-kd protein. Furthermore, no mRNA isolated from 
crown gall tumors has been observed to hybridize to this 
region. 

Transcription initiation and polyadenylation signals. Com- 
parisons of a multitude of eukaryotic protein-encoding genes 
have revealed a limited number of consensus sequences po- 
tentially involved in RNA polymerase II-mediated transcrip- 
tion. The 'TATA' box or Goldberg-Hogness box (Proudfoot, 
1979) is located 25-30 bp upstream from the start site of 
transcription and is involved in vivo in the accurate position- 
ing of the mRNA start site (McKnight and Kingsbury, 1982). 
The consensus sequence GG(C/T)CAATCT of the 'CCAAT' 
box (Benoist et al., 1980), which appears 40-50 nucleotides 
upstream of the TATA box, is involved in the regulation of 
transcription of some eukaryotic genes. By comparing plant 
genes, a possible regulatory sequence, called AGGA box, was 
identified by Messing et al. (1983). As the transcription of 
TL-DNA genes is α-amanitin sensitive (Willmitzer et al., 
1981) and potential control signals in the 5' regions of the 
T-DNA genes (De Greve et al., 1982a; Depicker et al., 1982; 
Dhaese et al., 1983; Heidekamp et al., 1983), of which the 
transcription initiation site was accurately determined, have 
been found resembling those typically used by eukaryotes, we 

Table V. Eukaryotic signals present in 5' and 3' sequences of the different transcripts 

               Position  'CCAAT' box    Position  'TATA' box  Position  Poly(A)

Consensus                GG(C/T)CAATCT            TATA                  AATAAA
sequence

Transcript 5   909       GGCgAATaT      983       aATAAtA     1912      AATAAT
               935       acgCAATta      1012      TATAAgA     1948      AAtAAT
               979       taCCAATaa      1029      TtTATAT
               1001      GGCCAtTta

Transcript 7   2800      GtTCAAgCT      2735 *    TATATAT     2188      AATAAA

Transcript 2   4932      GcgCAAgCT      4909      TATATtT     3281      AATAAT
               4943      caCCAATaa                            3297      AATAAT
                                                              3312      AATAAA
                                                              3364      AATAAT

Transcript 1   5092      GcCCAAatT      5175      TATtTAT     7710      AATAAT
               5118      tGTCAAcga                            7727      AATAAT
               5144      tcTCAActT

Transcript 4   8072      ctTCAATaa      8098      aATATAA     9101      AATAAA
               8080      aaTgAATtT      8131      TATAAAA     9169      AATAAA
               8094      aGaCAATaT

Transcript 6a  9294      GcgaAATtT      9326      TATtAAT     10 030    TATAAA
                                                              10 085    AATGAA

Transcript 6b  11 169    caCCAATga      11 137    TATAAAA     10 260    AATAAT
               11 204    taTCAATCT                            10 355    AATAAA
                                                              10 434    AATAAA

Transcript 3   13 114    aCrCAATac      13 088    TATtTAA     11 778    AATAAT
                                                              11 810    AATATA
                                                              11 814    AATGAA

searched for homologies with these putative regulatory se- 
quences in the 5'-untranslated region of the TL-DNA genes. 
In the 5'-untranslated region of transcript 5, three sequences 
AATAATA, TATAAGA, and TTTATAT (positions 983, 
1012 and 1029), sharing homology with the TATA sequence, 
are located respectively 77, 48 and 31 bp upstream from the 
translation start codon and are preceded by four 'CCAAT'- 
like sequences (GGCGAATAT at position 909, ACGCAAT- 
TA at 935, TACCAATAA at 979, GGCCATTTA at 1001). 
Transcript 2 has a TATATTT sequence (position 3460) and 
two possible CCAAT sequences (GCGCAAGCT at position 
4932 and CACCAATAA at 4943). A TATTTAT sequence 
(position 5175) is located 34 bp upstream from the translation 
start codon of the gene encoding transcript 1. This TATA 
box is preceded by three possible CCAAT boxes (positions 
5092, 5118, and 5144). The 5'-untranslated region of the gene 
encoding transcript 6a contains a TATTAAT sequence (posi- 
tion 9326) located 69 bp upstream from the ATG translation 
codon and a CCAAT sequence (position 9294) located 32 bp 
upstream from the presumed TATA box. The gene encoding 
transcript 6b has a TATAAAA sequence (position 11 137) 
61 bp upstream from the translation start codon. Two 
CCAAT sequences (position 11 169 and 11 204) are located 
upstream of the TATA box at a distance of 32 bp and 67 bp. 
A summary of the eukaryotic signals found in the 
5'-untranslated regions is listed in Table V. However, we did 
not find sequences in the 5'-untranslated regions of the TL- 
DNA sharing significant homology with the AGGA box 
(Messing et al., 1983). 
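
The notation used in Table V, in which bases agreeing with the consensus are written in upper case and deviating bases in lower case, can be reproduced with a short script. This is a minimal sketch rather than the program used by the authors. The function names are hypothetical, and the IUPAC letter Y is used here as shorthand for the (C/T) position of the consensus.

```python
# Compare a candidate 9-mer with the 'CCAAT' box consensus GG(C/T)CAATCT.
# ASSUMPTION: Y (IUPAC pyrimidine code) stands for the (C/T) position.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT"}
CCAAT_CONSENSUS = "GGYCAATCT"


def mark_matches(candidate: str, consensus: str = CCAAT_CONSENSUS) -> str:
    """Upper-case the bases that fit the consensus, lower-case the rest."""
    if len(candidate) != len(consensus):
        raise ValueError("candidate and consensus must have the same length")
    marked = []
    for base, code in zip(candidate.upper(), consensus):
        marked.append(base if base in IUPAC[code] else base.lower())
    return "".join(marked)


def count_matches(candidate: str, consensus: str = CCAAT_CONSENSUS) -> int:
    """Number of positions at which the candidate fits the consensus."""
    return sum(1 for ch in mark_matches(candidate, consensus) if ch.isupper())


# Two of the transcript 5 sequences quoted in the text.
for seq in ("GGCGAATAT", "TACCAATAA"):
    print(seq, "->", mark_matches(seq), f"({count_matches(seq)}/9 matches)")
```

Applied to the transcript 5 sequences at positions 909 and 979, the sketch returns the same mixed-case strings as listed in Table V.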

Sequences essential for the in vivo expression of eukaryotic 
genes, however, are located, in most cases, 200-300 bp 
upstream of the transcription initiation site. From genetic 
studies, there is evidence that sequences upstream of the 
TATA and CCAAT boxes are also involved in the in vivo ex- 
pression of the octopine synthase gene (Koncz et al., 1983) in 
plant cells. We did not find nucleotide sequence homology 
between this 5' upstream region of the octopine synthase 
gene and the 5' upstream regions of the other TL-DNA 
genes. 

Most eukaryotic protein-encoding transcripts are poly- 
adenylated. The only primary sequence common to the 
3'-untranslated region of almost all eukaryotic genes is the 
hexanucleotide AATAAA (Proudfoot and Brownlee, 1976; 
Benoist et al., 1980), or a one-base variation of this sequence 
(Nevins, 1983). This sequence functions in the recognition of 
the poly(A) addition site (Fitzgerald and Shenk, 1981; 
Montell et al., 1983). The poly(A) addition sites of the octo- 
pine synthase (De Greve et al., 1982a), the nopaline synthase 
(Depicker et al., 1982), the octopine synthase present in the 
regenerated plant rGV1 and transcript 7 (Dhaese et al., 1983) 
are indeed closely preceded by this hexanucleotide signal. In 
the case of the wild-type octopine synthase and the rGV1 oc- 
topine synthase multiple polyadenylation sites have been 
observed. This was also found to occur in animal genes 
(Setzer et al., 1980; Early et al., 1980). We looked for the 
presence of AATAAA or related sequences in the 
3'-untranslated regions of the TL-DNA genes encoding 
transcripts 5, 2, 1, 6a and 6b. For each gene at least two 
potential canonical sequences are found. Transcripts 5 and 1 
each contain two polyadenylation signals AATAAT (position 
1912 and 1948 for transcript 5 and 7710 and 7727 for 
transcript 1). In transcript 5, these are located at a distance of 
172 bp and 208 bp downstream of the stop codon, and those 
of transcript 1 at 234 bp and 251 bp downstream from the 
stop codon. The 3'-untranslated region of transcript 2 con- 
tains four possible polyadenylation signals: AATAAT (posi- 
tion 3281), AATAAT (3297), AATAAA (3312) and 
AATAAT (3364), respectively 96, 148, 163 and 180 bp, past 
the translational stop. In the 3' region of transcript 6b three 
polyadenylation signals AATAAT (10 260), AATAAA 
(10 355), and AATAAA (10 434) are found respectively 193, 
98 and 19 bp downstream from the stop codon. Transcript 6a 
has two sequences: TATAAA (10 030) and AATGAA 
(10 085) in its 3' end which are located at a distance of 60 bp 
and 115 bp downstream from the stop codon. All these data 
are summarized in Table V. 
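
A search for AATAAA "or a one-base variation of this sequence" amounts to scanning every hexanucleotide and keeping those within one mismatch of the canonical signal. The following sketch illustrates that idea on a made-up toy sequence. The function names and the test sequence are assumptions for illustration and do not come from the TL-DNA sequence.

```python
# Scan a sequence for AATAAA and its one-base variants.
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_polya_signals(seq: str, canonical: str = "AATAAA", max_mismatch: int = 1):
    """Return (1-based position, hexamer) for every hit in seq."""
    seq = seq.upper()
    k = len(canonical)
    hits = []
    for i in range(len(seq) - k + 1):
        hexamer = seq[i:i + k]
        if hamming(hexamer, canonical) <= max_mismatch:
            hits.append((i + 1, hexamer))
    return hits


# Hypothetical toy sequence, not part of the TL-DNA.
toy = "GCGCAATAATCGCGCGAATAAAGC"
print(find_polya_signals(toy))
```

Because one-base variants are frequent in AT-rich 3' regions, hits found this way are only candidates, which is why the text speaks of potential signals.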

Translation initiation codons. In eukaryotes, the first AUG 
of the majority of mRNAs is used as an initiation codon. In 
the scanning model, two bases (A or G at position -3, G at 
position +4) flanking the initiation codon (A/GXXAUGG) 
facilitate the recognition of the functional AUG codon 
(Kozak, 1981). 
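
The flanking-base rule of the scanning model can be expressed as a simple check on the bases at positions -3 and +4 relative to the A of the AUG (taken as +1). The sketch below is an illustration under that numbering convention. The function names and the two test sequences are hypothetical.

```python
# Check the context A/GXXAUGG around a candidate initiation codon.
def first_aug(mrna: str) -> int:
    """0-based index of the first AUG, or -1 if there is none."""
    return mrna.upper().replace("T", "U").find("AUG")


def kozak_context(mrna: str, aug_index: int) -> dict:
    """Report whether position -3 is a purine and position +4 is a G.

    aug_index is the 0-based index of the A of the AUG (position +1).
    """
    mrna = mrna.upper().replace("T", "U")
    if mrna[aug_index:aug_index + 3] != "AUG":
        raise ValueError("no AUG at the given index")
    minus3 = mrna[aug_index - 3] if aug_index >= 3 else None
    plus4 = mrna[aug_index + 3] if aug_index + 3 < len(mrna) else None
    return {"minus3_purine": minus3 in ("A", "G"), "plus4_G": plus4 == "G"}


# Hypothetical sequences: a favourable and an unfavourable context.
good = "CCACCAUGGC"
poor = "UUUCUUAUGAC"
print(kozak_context(good, first_aug(good)))
print(kozak_context(poor, first_aug(poor)))
```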

Since none of the amino acid sequences of the proteins en- 
coded by the TL-DNA in plant cells have been determined, 
no experimental data exist concerning the sites used to initiate 
translation of the plant transcripts. As can be seen in Figure 
2, the first AUG following the 'TATA' box is in phase with 
all the open-reading frames and most likely initiates transla- 
tion in plants. The first AUGs of these plant transcripts are 
preceded by a very G-poor stretch of DNA and do not con- 
tain a Shine-Dalgarno sequence (Shine and Dalgarno, 1974; 
Stormo et al., 1982). This lack of Gs upstream of eukaryotic 
initiation codons has already been observed (Kozak, 1981; 
Sargan et al., 1982). 



In the open-reading frames of the genes encoding transcript 
5, 7, 2, 4 and 3 the second AUG is located at a distance of 
300, 231, 354 and 252 bp, respectively, of the first AUG. In 
the case of open-reading frames 2 and 4, which are translated 
in E. coli mini-cells (Schroder et al., 1983) these data support 
the hypothesis that the same translational start is used in 
bacteria as well as in plant cells. Two AUG codons (positions 
11 019 and 11 076) can be used as initiation codon for 
transcript 6b. Both AUG codons are flanked by a G (position 
-3) and an A (position +4). Because the initiation codons 
are equivalent, there is no reason to believe that the first AUG 
codon is not used as the translational start. 

In transcript 6a three AUG codons (position 9395, 9404 
and 9410) can be used as initiation codon. The first and the 
third AUG codons are flanked by two bases which facilitate 
the recognition of functional AUG codons (Kozak, 1981). 
Comparison of the TL-DNA sequence of transcript 6a with 
the corresponding nopaline T-DNA sequence (unpublished 
data) indicates that in the homologous pTiC58 sequence only 
the third AUG is conserved. This observation suggests that 
translation of the octopine transcript 6a starts at the third 
AUG. However, we cannot exclude that the transcripts 6a en- 
coded by the octopine TL-DNA and the nopaline T-DNA, 
respectively, have different translational starts. 

Transcript 1 also contains three AUG codons in the 
beginning of the frame (positions 5209, 5260 and 5275). 
Although we have no data to support that the first AUG is 
not used as the initiation signal in the plant cells, the possibili- 
ty exists that the third AUG, which is preceded by a 
GGTGGA sequence (position 5262) might be preferably used 
in a bacterial background. The difference in mol. wt. will be 
2.3 kd, when calculated from the sequence, and the cor- 
respondence between the observed mol. wts. of the shorter 
polypeptides (53 and 66 kd) (Schroder et al., 1983) and the 
computed mol. wts. (52.7 and 66.7 kd) is even better. 

To solve the question of whether the same translation start 
codon is used in plant cells and in bacteria, amino acid se- 
quences of both will be needed. 

Fig. 4. GC profile of the TL-DNA. A window of 100 bp was slid along the sequence by increments of 50 bp, and its G + C percentage was calculated. The 
position and size of each known coding region and its orientation is indicated by arrows. The two parts of the figure are contiguous, but the right part of 
transcript 1 is repeated in the lower figure in order to emphasize the periodicity of the GC content. 

Intervening sequences. A characteristic but not an absolute 
criterion of eukaryotic genes is the presence of intervening se- 
quences. To date, several plant nuclear genes have been 
shown to contain intervening sequences (Sun et al., 1981; 
Fisher and Goldberg, 1982; Hyldig-Nielsen et al., 1982; Shah 
et al., 1982), while several others lack intervening sequences 
(Geraghty et al., 1981; Fisher and Goldberg, 1982; Pedersen 
et al., 1982). The existence of introns in the coding regions of 
the different TL-DNA transcripts is very unlikely. Firstly, the 
open-reading frames correlate well with the sizes of the cyto- 
plasmic polyadenylated transcripts 1, 2, 3, 4, 5, 6a, 6b and 7, 
determined by Northern analysis (Willmitzer et al., 1982, 
1983). Secondly, as discussed above, the sizes of the proteins 
observed experimentally in vitro (Schroder and Schroder, 
1982), and in E. coli (Schroder et al., 1983) correspond nicely 
to those calculated from the sequence presented in Figure 3. 
Furthermore, we have looked without success for sequences 
fitting with the donor and acceptor consensus sequences pro- 
posed by Mount (1982) normally found at the intron-exon 
junctions. 

G + C content. A striking feature of the TL-DNA sequence 
(Figure 4) is observed when a graphical display of a G + C 
content profile is plotted. Each functional coding sequence is 
separated from its neighbours by an AT-rich interval. The 
3'-untranslated region of each transcript is very AT-rich, a 
feature also observed in the 3'-untranslated region of other 
plant genes, ranging from 24% G + C in the soybean leg- 
hemoglobin gene (Hyldig-Nielsen et al., 1982) to 37% G + C 
in the ribulose-1,5-bisphosphate carboxylase gene (Bedbrook 
et al., 1980). The dip in the G + C profile is less marked bet- 
ween transcripts 1 and 2, possibly because in this case both 5' 
ends are very close to one another. Furthermore, these large 
variations of G + C content can be visualized under the elec- 
tron microscope by partial denaturation of the Ti plasmid 
and are limited to the TL-region and the homologous region 
of the nopaline T-DNA (G. Engler, personal communica- 
tion). 
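
The G + C profile of Figure 4 is a sliding-window calculation, with a 100-bp window moved in 50-bp increments. The sketch below shows one way to compute such a profile. It is not the authors' program, the function name is hypothetical, and the 300-bp toy sequence is invented to make the alternation of GC-rich and AT-rich stretches obvious.

```python
# Sliding-window G + C percentage (window and step as in the Fig. 4 legend).
def gc_profile(seq: str, window: int = 100, step: int = 50):
    """Return (1-based window start, G + C percentage) for each window."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append((start + 1, 100.0 * gc / window))
    return profile


# Hypothetical toy sequence: GC-rich, AT-rich, GC-rich blocks of 100 bp each.
toy = "GC" * 50 + "AT" * 50 + "GC" * 50
for start, percent in gc_profile(toy):
    print(f"{start:>4}  {percent:5.1f}% G + C")
```

Windows that straddle a boundary give intermediate values, so the width of the window sets how sharply an AT-rich interval between two genes shows up in the plot.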

Conclusions 

From the determination and the analysis of the primary struc- 
ture of the TL-DNA sequence, the following conclusions can 
be drawn: (i) all the TL-DNA genes contain the signals to be 
transcribed and translated in plant cells; (ii) the absence of in- 
tervening sequences and the compact organization of the 
genes on the TL-DNA suggest that a maximum amount of 
genetic information is concentrated in a minimum amount of 
DNA. 

Materials and methods 

Enzymes 

DNA polymerase I (large fragment, according to Klenow) and T4 polynucleo- 
tide kinase were from Boehringer Pharma (Mannheim, FRG). 

Restriction enzymes were from Boehringer Pharma (Mannheim, FRG) or 
New England Biolabs (Beverly, MA, USA), and were used according to the 
suppliers' instructions. 

Bacterial strains and plasmids 

Bacterial strains and plasmids are listed in Table I. 

Plasmid preparation 

Agarose gel electrophoresis, conditions for DNA ligation, and transformation 
of competent E. coli cells were as described by Depicker et al. (1980). 

Plasmids were prepared from E. coli K514 or by CsCl-EtBr equilibrium 
density gradient centrifugation in cleared SDS lysates (Betlach et al., 1976). 
The copy number of the pBR derivatives was increased by adding chloram- 
phenicol (170 µg/ml) or spectinomycin (300 µg/ml) to an exponentially grow- 
ing culture and incubating for a further 15 h. 

DNA sequence determination 

DNA fragments to be sequenced were labeled at their 5' ends with [γ-32P]- 
ATP (> 2000 Ci/mmol, Amersham) and T4 polynucleotide kinase (Boeh- 
ringer, Mannheim, FRG) after treatment with bacterial alkaline phosphatase 
(Boehringer, Mannheim, FRG); DNA fragments were labeled at their 3' ends 
using either [32P]cordycepin (NEN) and terminal nucleotidyl transferase, or 
[α-32P]dATP and Klenow polymerase (Boehringer, Mannheim, FRG). The 
labeled fragments, after secondary restriction, were extracted from low- 
gelling temperature agarose as described by Wieslander (1979), or, after 
strand separation, were extracted from acrylamide as described by Maxam 
and Gilbert (1980). 

The five chemical modification and cleavage reactions G, A + G, C + T, C 
and A + C were performed as described by Maxam and Gilbert (1980). The 
cleavage products were separated on and 15% gradient acrylamide gels 
(0.3 mm x 90 cm) containing 8.3 M urea (Sanger and Coulson, 1978). The 
gels were autoradiographed at -70°C using intensifying screens. 

Computer analysis 

Routine analysis (restriction sites, overlaps) of the sequencing data was per- 
formed on a Cromemco microcomputer using the mapping and comparison 
programs written by Schroeder and Blattner (1982) for the CP/M operating 
system. We developed a program along the lines of the RNY algorithm 
described by Shepherd (1981) and the programs used to calculate the mol. wt. 
of the proteins (Table II), the codon usage (Table III), and the GC profile of 
the sequence (Figure 4). The limited computing ability of our microcomputer 
did not allow us to perform extensive searches of similarities using the Sellers 
(1979), or Needleman and Wunsch (1970) algorithms. Imperfect repeats might 
therefore have escaped. A machine-readable copy of the sequence has been 
sent for incorporation in the EMBL data base. 
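
A codon-usage program of the kind mentioned above also yields the base composition at each codon position, which is how a preference for G in the first position or A in the second can be detected. The sketch below is an illustration only. It is not the program written for the Cromemco microcomputer, and the function name and toy reading frame are hypothetical.

```python
# Base composition (in percent) at the three codon positions of a reading frame.
from collections import Counter


def codon_position_composition(orf: str):
    """Return three dicts, one per codon position, mapping base -> percent."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3  # ignore an incomplete trailing codon
    if usable == 0:
        raise ValueError("reading frame shorter than one codon")
    composition = []
    for pos in range(3):
        bases = orf[pos:usable:3]
        counts = Counter(bases)
        composition.append({b: 100.0 * counts[b] / len(bases) for b in "ACGT"})
    return composition


# Hypothetical toy frame: ATG GCT GAA GGT TAA
toy_orf = "ATGGCTGAAGGTTAA"
comp = codon_position_composition(toy_orf)
for position, table in enumerate(comp, start=1):
    print(position, table)
```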
Abstract 

The effect of subcellular localization on single-chain antibody (scFv) expression levels in transgenic 
tobacco was evaluated using an scFv construct of a model antibody possessing different targeting sig- 
nals. For translocation into the secretory pathway a secretory signal sequence preceded the scFv gene 
(scFv-S). For cytosolic expression the scFv antibody gene lacked such a signal sequence (scFv-C). Also, 
both constructs were provided with the endoplasmic reticulum (ER) retention signal KDEL (scFv-SK 
and scFv-CK, respectively). The expression of the different scFv constructs in transgenic tobacco plants 
was controlled by a CaMV 35S promoter with double enhancer. The scFv-S and scFv-SK antibody 
genes reached expression levels of 0.01% and 1% of the total soluble protein, respectively. Surprisingly, 
scFv-CK transformants showed considerable expression of up to 0.2% whereas scFv-C transformants 
did not show any accumulation of the scFv antibody. The differences in protein expression levels could 
not be explained by the steady-state levels of the mRNAs. Transient expression assays with leaf pro- 
toplasts confirmed these expression levels observed in transgenic plants, although the expression level 
of the scFv-S construct was higher. Furthermore, these assays showed that both the secretory signal and 
the ER retention signal were recognized in the plant cells. The scFv-CK protein was located intracel- 
lularly, presumably in the cytosol. The increase in scFv protein stability in the presence of the KDEL 
retention signal is discussed. 

Introduction 

Recent advances in antibody engineering offer 
various perspectives to endow plants with new 
properties. Antibodies and antibody fragments 
can be used to engineer disease resistance, to alter 
or design metabolic routes with catalytic anti- 
bodies, and to study plant growth and develop- 
ment by antisense-like approaches [32]. For these 
applications it is crucial to have functional anti- 
bodies located in the proper subcellular compart- 
ment. This can be accomplished by providing the 
antibody with suitable targeting and sorting sig- 
nals [1]. 

The engineering of antibodies is facilitated by 
their domain structure. The domains carrying the 
antigen-binding loops can be manipulated in dif- 
ferent ways to create various biologically active 
fragments [42]. An interesting and valuable anti- 
body fragment is the single-chain antibody (scFv), 
in which the variable domains of light and heavy 
chain are connected by a flexible peptide linker. 
Through expression of scFvs, several problems 
inherent to the post-translational processing of 
complete antibodies, such as assembly of the four 
subunits, formation of intermolecular disulphide 
bonds and glycosylation, can be circumvented 
[15,17]. 

Single-chain antibodies have been successfully 
expressed in plants. Constitutive cytosolic expres- 
sion of an scFv antibody in tobacco mediated 
resistance against artichoke mottled crinkle virus 
[36]. Owen et al. [27] and Firek et al. [11] re- 
ported cytosolic expression and secretion of an 
anti-phytochrome scFv antibody. 

Cytosolic expression of functional scFv anti- 
bodies in plants and other eukaryotes [2, 3, 41] 
is remarkable. The two intramolecular disulphide 
bridges (one in VH and one in VL) which are 
assumed to be necessary for folding into a stable 
and functional scFv [14] are expected not to be 
formed in the reducing environment of the cyto- 
sol because of the absence of the enzyme protein 
disulphide isomerase [13], which catalyses the 
formation of such bonds. 

Despite the reported successes, intracellular 
expression of scFv antibodies in plants may not 
be that straightforward. Owen et al. [27] reported 
that only after screening more than 100 transgenic 
plants, transformed with 'leaderless' scFv con- 
structs, a plant showing an expression level of 
0.1% of the total soluble protein fraction was 
obtained, while transformants expressing the 
secretory version of the scFv gene produced ten 
times more scFv protein [11]. 



The objective of our study was to compare 
functional expression of scFv proteins in trans- 
genic tobacco plants if targeted to different sub- 
cellular compartments. The scFv gene was de- 
rived from the heavy and light chain genes of an 
antibody raised against a cutinase (21C5) of 
Botrytis cinerea [29]. Both with and without sig- 
nal peptide the expression of this scFv gene greatly 
improved when the C-terminal endoplasmic 
reticulum (ER) retention signal peptide, KDEL 
[28], was added. Possible causes for this strong 
enhancement of expression and the implications 
for antibody expression in plants are discussed. 

Materials and methods 

Bacterial vectors and strains 

For cloning of the scFv inserts the bacterial ex- 
pression vector pHEN1 [19] was modified by 
substituting the multiple cloning site and deleting 
the g3p gene (pNEM5). Addition of the KDEL 
(Lys-Asp-Glu-Leu) coding sequence behind the 
c-myc tag sequence resulted in pNEM5K. The 
Escherichia coli strains DH5α and TG1 were used 
for routine cloning and scFv protein expression, 
respectively. 



Plant vectors 

The vectors pCP033, pCP033T and pCP035 
were used for plant transformations and transient 
assays. These vectors are closely related to 
pCP05 [12] and only differ between the T-DNA 
borders (Fig. 1). The vector pCP033 contains a 
promoter-terminator cassette composed of a 
truncated cauliflower mosaic virus (CaMV) Cabb 
B-D 35S promoter (-343/-1) with duplicated 
enhancer sequence (-343/-90) together with the 
38 bp alfalfa mosaic virus (AlMV) RNA4 un- 
translated leader [33], a polylinker with unique 
NcoI, SstI, SmaI and BglII cloning sites, and the 
nopaline synthase terminator, respectively. Fur- 
thermore, the β-lactamase gene for prokaryotic 
selection (ampicillin in E. coli, carbenicillin in 
Agrobacterium tumefaciens) and the APH(3')II 
gene under the control of the nopaline synthase 
promoter for kanamycin resistance selection at 
the plant level were located between the T-DNA 
borders. pCP033T contains a c-myc tag sequence 
[25] between the multiple cloning site and nopa- 
line synthase terminator. pCP035 contains a 
mouse kappa light chain signal sequence as NcoI- 
SalI fragment between the 35S promoter and the 
KpnI site. The mouse signal sequence for ER 
translocation is derived from the kappa light 
chain CEA 66E3 [21, 37], and was chosen be- 
cause minor changes could create a SalI site, 
which is rare in antibody genes [5, 21]. The sig- 
nal sequence was made synthetically with an NcoI 
site at the 5' end (triplet position -24) and a SalI 
site at the 3' end (triplet position -3). 

Isolation, amplification and cloning of antibody 21C5 
variable domains 



Isolation of poly(A)+ RNA from 21C5 hybri- 
doma cells [29] was performed by using the 
QuickPrep Micro mRNA purification kit (Phar- 
macia). First strand cDNA was synthesized using 
the Pharmacia First Strand cDNA Kit. The var- 
iable heavy (VH) and light domains (VL) of the 
21C5 antibody were amplified through PCR using 
the following primers: 5'-end primer (H53) 
5'-GGTCTCGAGTGTGAGGTCCAGCTGCAACAATCTG-3' 
and 3'-end primer (VH33) 
5'-ATGCGTTAACCCCGGGTGTTGTTTTGGCTGMRGAGACDGTGAS-3' 
for the heavy chain, and 5'-end primer (L5d) 
5'-GGTGTCGACGGTGATGTTKTGATGACCCAAA-3' 
and 3'-end primer (VK1) 
5'-GGCTCGAGTTTGGATTCGGAGCCGGATCCTGAGGATTTACCCTCCCGTTTTATTTCCAGSTTGGTSCCYCC-3' 
for the light chain. Primers L5d and H53 
contained a SalI and XhoI site at their 5'-end, 
respectively. Primer VH33 was chosen such that, 
after PCR amplification and digestion with SmaI, 
the VH domain still contained the initial five 
triplets of the CH1 domain, encoding Ala-Lys- 
Thr-Thr-Pro. The primer VK1 carried an XhoI 
site at the 5'-end. Primer VK1 also encodes a 
sequence for a synthetic linker peptide, adapted 
with some modifications from Chaudhary et al. 
[5]. For amplification first strand cDNA was 
denatured at 94 °C for 4 min and subjected to 35 
cycles of PCR using Vent DNA Polymerase 
(New England Biolabs). Each PCR cycle con- 
sisted of denaturation at 94 °C for 1 min, 
annealing at 60 °C for 2 min, and primer 
extension at 72 °C for 3 min. The amplified 
fragments were purified from agarose gel, digested 
with the appropriate restriction enzymes, and 
ligated simultaneously into SalI/SmaI-digested 
pNEM5 and pNEM5K, resulting in the vectors 
pNEM-scFv and pNEM-scFv-K, respectively. 
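
The primers above contain degenerate positions written in IUPAC code (M, R, D, K, S, Y) and carry the restriction sites used for cloning. The sketch below counts how many distinct oligonucleotides each degenerate primer represents and locates the SalI, XhoI and SmaI recognition sequences. It is an illustration only. The primer strings are transcribed from the text as printed here and may carry transcription errors, and the function names are hypothetical.

```python
# Degeneracy and restriction sites of the PCR primers.
# ASSUMPTION: primer sequences are copied from the text as printed.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Recognition sequences of the enzymes named in the text.
SITES = {"XhoI": "CTCGAG", "SalI": "GTCGAC", "SmaI": "CCCGGG"}

PRIMERS = {
    "H53": "GGTCTCGAGTGTGAGGTCCAGCTGCAACAATCTG",
    "VH33": "ATGCGTTAACCCCGGGTGTTGTTTTGGCTGMRGAGACDGTGAS",
    "L5d": "GGTGTCGACGGTGATGTTKTGATGACCCAAA",
    "VK1": ("GGCTCGAGTTTGGATTCGGAGCCGGATCCTGAGGATTTACCC"
            "TCCCGTTTTATTTCCAGSTTGGTSCCYCC"),
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total


def sites_in(primer: str) -> list:
    """Names of the restriction sites found in the primer, in SITES order."""
    primer = primer.upper()
    return [name for name, site in SITES.items() if site in primer]


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, degeneracy {degeneracy(seq)}, sites {sites_in(seq)}")
```

The sites found in this way agree with the description in the text: XhoI in H53 and VK1, SalI in L5d, and SmaI in VH33.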

The nucleotide sequences of the scFv inserts 
were verified by the dideoxy chain termination 
sequencing method [31] on an A.L.F. DNA se- 
quencer (Pharmacia). The sequence encoding the 
21C5 scFv was subjected to computer analysis 
with the Wisconsin GCG software package [9]. 
From the derived protein sequence the molecular 
weight was calculated and the algorithm for pre- 
dicting processing sites for eukaryotic signal se- 
quences was used [39]. 



Bacterial expression of scFv cassettes 

E. coli strain HB2151 was transformed with 
pNEM-scFv and pNEM-scFv-K. For the scFv 
expression assay 5 ml 2 x TY, 1% (w/v) glucose 
and 100 µg/ml ampicillin, was inoculated with a 
colony containing the appropriate plasmid, and 
incubated at 30 °C for 16 h. Fresh medium con- 
taining 2 x TY, 0.075% (w/v) glucose and 1 µg/ 
ml ampicillin was inoculated with 1/50 volume of 
the bacterial culture and incubated at 30 °C for 
3 h. Then the scFv synthesis was induced by 
adding isopropyl β-D-thiogalactoside (IPTG) to 
a final concentration of 1 mM and the incubation 
was continued for another 4 h. The periplasmic 
proteins were extracted by osmotic shock [26]. 
Borate buffer was added to the periplasmic frac- 
tion to a final concentration of 0.2 M sodium 
borate, pH 8.0, and 0.16 M NaCl. The scFvs 
were purified by affinity chromatography with ac- 
tivated protein A Sepharose (Pharmacia) to which 
the anti c-myc tag 9E10 monoclonal antibody [25] 
was covalently attached. 



Cloning in plant vectors and tobacco transformation 

To generate constructs suitable for cloning in 
plant vectors without the signal sequence the 
pNEM-scFv and pNEM-scFv-K vectors were 
digested with NcoI and the ends were filled in with 
Klenow. After digestion with HincII at the SalI 
site the fragments were purified and blunt-end- 
ligated resulting in the vectors pNEM-scFv-C and 
pNEM-scFv-CK, respectively. Thus, the NcoI 
site was restored providing the ATG start codon 
in the proper reading frame. Furthermore, the 
ATG start codon was placed at position -3 of 
the mature scFv sequence. The constructs lack- 
ing the KDEL sequence were cloned as 
SmaI (pNEM-scFv) or NcoI/SmaI (pNEM- 
scFv-C) fragments into the NcoI/SmaI digested 
plant vector pCP033T. For construction of the 
scFv-S the NcoI/SalI signal sequence fragment 
was also included in the ligation mixture. The 
resulting vectors, pCPO-scFv-S and pCPO- 
scFv-C, had the single chain construct in frame 
with the c-myc tag sequence. The scFv-K and 
scFv-CK constructs were cloned as SalI/BclI 
(pNEM-scFv-K) and NcoI/BclI (pNEM-scFv- 
CK) fragments and transferred to the SalI/ 
BglII-digested pCP035 and NcoI/BglII-digested 
pCP033, respectively. The resulting vectors were 
designated as pCPO-scFv-SK and pCPO-scFv- 
CK. All vector-scFv junctions were verified by 
sequencing. 

Tobacco transformation was conducted ac- 
cording to van Engelen et al. [37]. 

Protoplasts 

Transient expression assays in tobacco (N. 
tabacum cv. Samsun NN) leaf protoplasts were 
performed according to the polyethylene glycol 
procedure as described by Denecke et al. [7]. The 
same protoplast isolation and culture method 
was employed to study secretion and retention in 
transgenic plants. Protoplasts and culture me- 
dium were separated by centrifugation and analy- 
sed by western blotting experiments and ELISA. 
For western analysis the proteins present in the 
culture medium were precipitated with 3 volumes 
of ethanol. Both protein pellet and protoplasts 
were dissolved in SDS-PAGE sample buffer (see: 
protein extraction and analysis). For ELISA the 
culture medium was diluted 1:1 with PBS, 0.1% 
Tween, 1% skimmed milk powder and 1 mM 
4-(2-aminoethyl)-benzenesulphonyl fluoride hy- 
drochloride (Pefabloc SC, Boehringer) and the 
protoplasts were lysed in the same buffer. All 
samples were further treated as described in pro- 
tein extraction and analysis. 



RNA extraction and analysis 

For extracting total RNA from plant tissues the 
guanidine hydrochloride procedure of Logemann 
et al. [23] was used. The RNA concentration was 
measured spectrophotometrically and northern 
analysis was carried out according to Sambrook 
et al. [30]. 

Briefly, 9 µg RNA was separated on a 1.2% 
(w/v) agarose (Pharmacia) formaldehyde gel. As 
size marker 1 ng denatured 21C5 scFv DNA and 
1 µg of the 0.16-1.77 kb RNA ladder (Life Tech- 
nologies) were used. After electrophoresis the gel 
was incubated twice for 15 min in DEPC-treated 
double distilled water and the RNA was trans- 
ferred to a Hybond-N+ membrane (Amersham) 
by vacuum blotting, using 20 x SSC as transfer 
buffer, and cross-linked to the membrane under 
UV light at 1.5 J/cm2. The blot was hybridized 
with [α-32P]dATP-labeled probes at 65 °C for 48 
h and further treated as described by Church and 
Gilbert [6]. The stringency of the final washing 
was 0.2 x SSC at 65 °C. The blot was first hy- 
bridized with a labeled scFv DNA fragment, iso- 
lated as SalI/SmaI fragment from pNEM-scFv. 
To establish the differences in the amount of total 
RNA the blot was hybridized with a ribosomal 
probe. To estimate the molecular sizes the blot 
was hybridized to labeled cDNA of the RNA 
ladder. All probes were obtained by random prime 
labeling [10]. 



Protein extraction and analysis 

Proteins were extracted by grinding tobacco 
leaves in liquid nitrogen to a fine powder. The 
powder was transferred to an eppendorf tube and 
mixed 1:1 (w/v) with SDS-PAGE sample buffer, 
containing 61 mM Tris-HCl pH 6.8, 2% (w/v) 
SDS, 12.5% (w/v) glycerol, 1 mM Pefabloc SC 
(Boehringer). Insoluble plant material was pel- 
leted by centrifugation for 5 min at 13000 rpm. 
The protein concentration in the supernatant was 
determined using the BCA method [34]. To the 
supernatant DTT and bromophenolblue were 
added to final concentrations of 40 mM and
0.008% (w/v), respectively, and the samples were
boiled at 100 °C for 5 min. For non-reducing gel 
electrophoresis DTT was omitted during sample 
preparation. Thirty µg of total protein was loaded
on a 13% SDS-polyacrylamide gel [22] (BioRad
mini protean system). After electrophoresis the
proteins were transferred to a PVDF membrane
(Millipore) by electroblotting. For immunodetec-
tion the membranes were incubated with 1:1000
diluted 9E10 monoclonal antibody, followed by a
1:5000 diluted rat-anti-mouse alkaline phos-
phatase conjugate (Jackson Immuno Research).
Alternatively, a rabbit polyclonal anti-21C5
serum, precleared from antibodies reacting to the
constant domains, was used in conjunction with
a 1:2500 diluted goat-anti-rabbit alkaline phos-
phatase conjugate (Jackson Immuno Research).
The blots were stained in 0.1 M ethanolamine-
HCl pH 9.6, supplemented with 4 mM MgCl2,
5-bromo-4-chloro-3-indolyl phosphate (0.06 mg/
ml) and nitroblue tetrazolium (0.1 mg/ml). The
relative molecular weights of the proteins were
estimated with pre-stained low-range molecular
weight markers (BioRad).

Purification of native scFv 21C5 antibody from 
plant extracts was carried out by polytron ho- 
mogenization of 4 g tobacco leaves in 4 ml 0.2 M 
sodium borate pH 8.0, containing 0.16 M NaCl 
and 1 mM Pefabloc SC (Boehringer), in the pres- 
ence of 200 mg insoluble polyvinylpyrrolidone 
(Serva). The soluble protein fraction was isolated 
by centrifugation and filtered through a 0.45 µm
filter (Millipore). The scFvs were purified by 
affinity chromatography with the 9E10 mono- 
clonal antibody coupled to activated protein A 
Sepharose (Pharmacia) as described previously. 

For use in ELISA assays the proteins were 
extracted by grinding 0.2-0.4 g tobacco leaves in 
liquid nitrogen to a fine powder. The powder was
transferred to an Eppendorf tube and mixed 1:2
(w/v) with PBS-0.1% (v/v) Tween (PBST) and 1
mM Pefabloc SC (Boehringer), and incubated on 
ice for 5 min. Insoluble material was removed by 
centrifugation at 13000 x g. The supernatant was 
stored at -80 °C until further use. 

The cutinase binding activity of the crude 
supernatant or purified scFv was determined by
ELISA. A 96-well plate was coated overnight at
4 °C with 1 µg/ml cutinase in 50 mM sodium
carbonate, pH 9.6 (100 µl/well). After blocking
for 30 min with 200 µl PBST-5% skimmed milk
powder per well the plates were washed and 100
µl protein extract per well was added. The plate
was incubated for 2 h. To determine the antigen-
binding capacity of the scFv antibody prepara-
tions, the wells were subsequently washed with
PBST, eluted with SDS-PAGE sample buffer,
and analysed by immunoblotting under non-re-
ducing conditions. Alternatively, for quantitative
ELISA, after washing with PBST each well was
incubated for another 2 h with 100 µl anti c-myc
tag antibody 9E10 (1 ng/µl) in PBST-1%
skimmed milk powder. Then, after washing three
times with PBST, the wells were incubated for 1
h with alkaline phosphatase conjugated rat-anti-
mouse antibody (Jackson Immuno Research), di-
luted 1:5000 in PBST-1% skimmed milk powder.



Finally the wells were washed five times with
PBST and 100 µl substrate (0.75 mg/ml p-
nitrophenylphosphate in 1 M diethanolamine, pH
9.8) was added and the OD405 was monitored.
All incubations were carried out at 37 °C.



Results 

Construction of the scFv expression cassettes 

An scFv gene was constructed containing the
variable domains of the 21C5 antibody heavy-
(VH) and light-chain (VL) genes [37] in the 5'-
VL-linker-VH-3' orientation. The 3' end of the VH
region was fused to the c-myc tag coding sequence
for detection and purification purposes. To en-
able translocation of the 21C5 scFv to the lumen
of the ER it was preceded by a murine κ light-
chain signal peptide (scFv-S; Fig. 2A). This signal
peptide has been shown previously to export full-
size antibodies efficiently to the plant apoplast
[37]. To retain the scFv-S antibody in the ER a
C-terminal ER retention signal KDEL was added
(scFv-SK; Fig. 2A). In addition two cytosolic
versions of the 21C5 scFv (scFv-C and scFv-CK;
Fig. 2A) were constructed, which both lacked the
ER translocation signal.

scFv-S       DVVMTQ=(VL)=IKREGKSSGSGSESKLECEV=(VH)=AAKTTPGRSEQKLISEEDLN
scFv-C       MDGDVVMTQ=(VL)=IKREGKSSGSGSESKLECEV=(VH)=AAKTTPGRSEQKLISEEDLN
scFv-SK      DVVMTQ=(VL)=IKREGKSSGSGSESKLECEV=(VH)=AAKTTPGAAAEQKLISEEDLNDIKDEL
scFv-CK      MDGDVVMTQ=(VL)=IKREGKSSGSGSESKLECEV=(VH)=AAKTTPGAAAEQKLISEEDLNDIKDEL

Ec-scFv      QVDGDVVMTQ=(VL)=IKREGKSSGSGSESKLECEV=(VH)=AAKTTPGAAAEQKLISEEDLN
Ec-scFv-K    QVDGDVVMTQ=(VL)=IKREGKSSGSGSESKLECEV=(VH)=AAKTTPGAAAEQKLISEEDLNDIKDEL

Fig. 2. A. Amino acid sequence of the different mature scFv antibody constructs expressed in plants and bacteria. The
VL and VH domains are connected by a 16 amino acid linker (bold). The c-myc tag (underlined) is located at the C-terminus.
B. Antigen-binding activity of scFv antibodies, isolated from E. coli, assayed by ELISA using the anti c-myc
antibody. Purified Ec-scFv and Ec-scFv-K antibodies were incubated in wells coated with 100 ng cutinase.
Individual points represent mean values of triplicate trials with standard deviations (error bars).

To determine if the presence of the KDEL re- 
tention signal had any effect on either antigen- 
binding capacity or detection with the anti c-myc 
tag antibody, the scFv genes, with and without 
KDEL sequence, were expressed in E. coli (Ec- 
scFv-K and Ec-scFv, respectively; Fig. 2A). Both 
scFv genes were preceded by the pelB signal pep- 
tide. After affinity purification the Ec-scFv and 
Ec-scFv-K antibodies showed similar binding 
properties to the cutinase antigen in an ELISA 
assay (Fig. 2B). Western blotting followed by im- 
munodetection using the anti c-myc tag antibody 
revealed proteins of 31 kDa (Fig. 3). Apparently, 
addition of the KDEL retention signal had no 
effect on the binding properties of the anti c-myc 
tag antibody 9E10. 

Expression of scFv antibodies in transgenic tobacco
leaves

The scFv cassettes were introduced into tobacco 
by Agrobacterium mediated transformation. As a 
control, transformation also was conducted using 
the empty vector pCP033T. Independent kana- 
mycin-resistant transformants were screened by 
immunoblotting of total protein extracts from 
leaves. All 43 plants containing the scFv-S con- 
structs showed poor expression. By comparison 
of the staining intensity on western blot of the 
plant produced scFv protein with known amounts 
of bacterially produced scFv, it was estimated 
that the maximum expression level reached was 
0.01% of total protein. No scFv protein was de- 
tected in 23 plants containing the scFv-C con- 
struct. However, 9 out of 15 scFv-CK transfor- 
mants showed scFv antibody expression with a 
maximum level of 0.2% (Fig. 3). Of the 15 
scFv-SK transformants 13 were expressing scFv
protein, the highest expression level being 1.0%
(Fig. 3).

Fig. 3. Western blot analysis of leaf tissue from two indepen-
dent tobacco transformants (a and b) containing the scFv-S,
scFv-C, scFv-SK and scFv-CK cassettes. The lanes marked
'Control' are from a transgenic tobacco plant transformed
with the vector pCP033T. The scFv antibodies were detected
using the anti c-myc antibody (9E10). The arrows marked
'I' indicate the scFv-S antibody and the arrow marked 'II' indi-
cates the 65 kDa protein band.

In plants expressing the scFv-SK protein an 
additional minor product of ca. 65 kDa was de- 
tected. To gain more insight into the nature of this 
65 kDa band, protein samples were prepared from 
leaves and analysed under non-reducing condi- 
tions. Western blotting showed that under these 
circumstances the fraction of the 65 kDa protein 
increased considerably for both scFv-SK and 
scFv-CK protein preparations (Fig. 4A). This 
could indicate that in plant cells the cysteine resi- 
due present in the linker peptide (Fig. 2A) may 
have been involved in dimer formation. To deter-
mine if both scFv protein and the presumed scFv-
dimer had antigen-binding capabilities, purified
scFv-CK and scFv-SK antibodies were incu-
bated with immobilized cutinase and analysed
after elution under non-reducing conditions
(Fig. 4B). Purified bacterially expressed Ec- 
scFv-K was used as a control. Western blotting 
of the eluents showed that not only scFv-CK and 
scFv-SK monomers, but also the 65 kDa proteins 
bound specifically to the cutinase antigen.

Fig. 4. A. Western blot of proteins from scFv-SK and
scFv-CK transgenic tobacco after non-reducing electrophore-
sis. The scFv antibodies were detected using the anti c-myc
antibody (9E10). Arrow indicates the 65 kDa protein band. B.
Binding of scFv antibodies to cutinase. Immunoblot of pro-
teins eluted from wells coated with (+) or without (-) cuti-
nase after incubation with scFv-SK and scFv-CK antibodies
purified from plants and the Ec-scFv-K antibody purified from
E. coli. The antibodies were detected with the anti-c-myc anti-
body (9E10). Asterisk and arrow indicate the scFv antibodies
and 65 kDa protein bands, respectively.



Accumulation of scFv mRNA and protein in trans-
genic tobacco leaves

Since the KDEL retention signal is thought to
function only in the secretory pathway the differ-
ence in expression level between the scFv-C and
scFv-CK was a surprise. Therefore, we investi-
gated whether the differences in protein accumu-
lation between the various constructs could be
explained by differences in the steady state mRNA
levels. For a number of plants both total RNA
and protein were isolated from the same leaf and
analysed (Figs. 3 and 5).

Northern blot analysis showed that the scFv
transgenic plants accumulated scFv mRNA of
the expected size of 1000-1200 bases, albeit in
different quantities (Fig. 5A). In addition, a much
less abundant mRNA of 1400 bases was detected.
The origin of this mRNA is not clear. It was not
detected in control plants and therefore may be a
read-through product of the scFv messenger
RNA. The difference in protein expression level
between the different scFv genes could not be
explained by various levels of scFv mRNA. The
scFv-C mRNA level (Fig. 5A, lane a) was com-
parable with the scFv-CK mRNA level (Fig. 5A,
lane b) but no scFv-C protein was present
whereas scFv-CK protein was detected (Figs. 3
and 5B, lanes a and b). Furthermore, a low
scFv-SK mRNA level (Fig. 5A, lane a) resulted
in a higher scFv protein accumulation than the
relatively high scFv-S mRNA level (Fig. 5A, lanes
a and b).

Fig. 5. RNA and protein analysis of leaf tissue from the same
independent tobacco transformants as depicted in Fig. 3. A.
RNA blot containing total RNA hybridized with a specific
scFv probe. Arrow indicates position of scFv transcripts. B.
scFv antibodies detected on western blot using the polyclonal
rabbit antiserum against the 21C5 antibody.

To exclude the possibility that the c-myc tag 
had been removed by plant proteases, thereby 
affecting our detection procedure, we used both 
anti-tag antibodies (Fig. 3) and an anti-21C5-Fv
rabbit polyclonal antiserum (Fig. 5B) for scFv de-
tection in a number of transgenic plants. Essen-
tially the same results were obtained, indicating
that the presence of the complete scFv antibody 
correlated with the presence of the tag. 

The addition of the KDEL retention signal el- 
evated the steady-state levels of the 21C5 scFv 
antibody, both with and without signal peptide. 

Expression of the scFv antibodies in tobacco leaf 
protoplasts 

The four different mature scFv proteins varied 
slightly in their number of amino acids (Fig. 2A).
The calculated sizes of the mature scFv proteins 
ranged from 30 kDa for the scFv-S to 31 kDa for 
the scFv-CK. An uncleaved signal peptide would 
increase the calculated size for the scFv-S and 
scFv-SK antibodies by 2.5 kDa. On western blot 
the protein bands showed only minor size differ- 
ences, the smallest molecule being the scFv-S 
protein (Fig. 3). This might indicate that the sig- 
nal peptides of both scFv-S and scFv-SK pro- 
teins were recognized and cleaved off during 
translocation into the ER. 

To determine whether the signal peptide and
the KDEL retention signal had the predicted
effects on scFv protein translocation, transient
expression assays were carried out in tobacco
protoplasts. Western blot analysis showed differ-
ences in the location of the scFv proteins (Fig. 6).
As expected, the scFv-S protein was secreted into
the incubation medium indicating that the signal
peptide was indeed functional. The scFv-SK and
scFv-CK proteins were predominantly found in-
side the protoplasts. The residual presence of
scFv-CK and scFv-SK protein in the incubation
medium was probably due to cell disruption dur-
ing the assay, since in a control experiment ex-
pressing a β-glucuronidase (GUS) construct
lacking a signal peptide some GUS activity was
detected in the medium. From the data obtained
with the scFv-SK expression we concluded that
the KDEL retention signal was able to retain the
scFv-SK antibody inside the protoplasts. This
was confirmed by using protoplasts, prepared
from the transgenics with a high scFv mRNA
level, which showed only intracellular accumula-
tion of the scFv-SK. However, in this case we
could not detect the scFv-S antibody, either in
the protoplasts or in the medium (results not
shown).

Fig. 6. Western blot analysis of a transient expression assay
in tobacco protoplasts transformed with plant vectors con-
taining the scFv-S, scFv-C, scFv-SK and scFv-CK gene cas-
settes. The 'Control' lane represents the transformation of
tobacco protoplasts with the vector pCP033T. The scFv anti-
bodies present in the cells and incubation medium were de-
tected by the anti c-myc antibody (9E10). Arrow indicates the
65 kDa protein band.



Discussion 

Successful applications for scFv antibodies ex-
pressed in plants, including creating resistance
against pathogens [36] or altering metabolic
pathways (e.g. catalytic antibodies), will to a large
degree depend on the ability to target the scFvs
to a particular subcellular compartment and to
optimize their expression level. Tavladoraki et al.
[36] described the successful expression of an
scFv antibody directed to the cytosol. Firek et al.
[11] reported a significant increase in the expres-
sion level of an scFv antibody against phyto-
chrome when secreted instead of expressed in the
cytosol [27]. They suggested that this difference
in expression levels was not the result of a differ-
ence in subcellular location but was caused by a
destabilizing effect of the phytochrome on the cy-
tosolic scFv [11].

Since no further data on the expression of scFv 
antibodies in different subcellular compartments 
of plant cells were available we decided to explore 
the possibilities of intracellular targeting of scFv 
antibodies and assess the effect on stability and 
accumulation. To improve intracellular stability
we targeted an scFv antibody away from the cy-
tosol to the potentially more favourable environ-
ment of the endoplasmic reticulum (ER) by add-
ing a signal peptide and the tetrapeptide KDEL
[8, 16, 40] (scFv-SK). For comparison, a secre-
tory version (scFv-S) of this molecule was used,
as well as two cytosolic counterparts, one with
and one without the KDEL retention signal
(scFv-CK and scFv-C, respectively). The expres-
sion level and localization of this scFv-SK anti-
body were compared with those of the scFv-C,
scFv-CK and scFv-S antibodies.

Of the tobacco transformants expressing the
scFv-SK cassette, 85% showed a high accumu-
lation of the protein in leaves. In some plants the
scFv protein comprised up to 1% of the total
soluble protein. Protoplasts prepared from these
transgenic plants showed total retention of
scFv-SK in the cells. This was confirmed by tran-
sient expression assays in tobacco protoplasts.
The scFv-SK antibody was retained intracellu-
larly while a large proportion of the scFv-S anti-
body was secreted into the culture medium. These
results indicated that the signal peptide was func-
tional. Furthermore, they showed that the KDEL
retention signal was probably well exposed and re-
cognized by a salvage receptor [35, 38], thereby
enabling the scFv antibody to be retained in the
ER. When compared with the plants expressing
the secreted scFv (scFv-S) the retention in the
ER resulted in a 100-fold increase in the amount
of detectable scFv antibody. These high accumu-
lation levels cannot be explained by differences in
the mRNA levels. It therefore seems that the high
level of scFv antibody accumulation is due to its
strict localization in the ER, where it is
protected from proteolytic activity further down
the secretory pathway, either intra- or extracellu-
larly. Similar results have been obtained with the
vacuolar protein vicilin, which also accumulated
to a much higher level when retained in the ER
[40].
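
The percentages and the 100-fold figure quoted in this Discussion follow from the transformant counts and maximum expression levels given in the Results. The sketch below recomputes them. The counts and levels are taken from the text, while the dictionary layout and function names are illustrative assumptions. Note that 13 of 15 works out to roughly 87%, which the text gives as 85%.

```python
# Counts and maximum levels as reported in the Results; the layout is illustrative.
SCREEN = {
    # construct: (plants with detectable scFv, plants screened,
    #             maximum level in % of total protein, or None if never detected)
    "scFv-C": (0, 23, None),
    "scFv-CK": (9, 15, 0.2),
    "scFv-SK": (13, 15, 1.0),
}

# Estimated maximum level reached by the secreted scFv-S (% of total protein).
SCFV_S_MAX_PERCENT = 0.01


def percent_expressing(positive, screened):
    """Share of screened transformants with detectable scFv protein."""
    return 100.0 * positive / screened


def fold_over_secreted(max_percent):
    """Maximum level of a construct relative to the scFv-S maximum."""
    return max_percent / SCFV_S_MAX_PERCENT


if __name__ == "__main__":
    for construct, (positive, screened, max_percent) in SCREEN.items():
        share = percent_expressing(positive, screened)
        if max_percent is None:
            print(f"{construct}: {share:.0f}% expressing, no protein detected")
        else:
            fold = fold_over_secreted(max_percent)
            print(f"{construct}: {share:.0f}% expressing, {fold:.0f}-fold over scFv-S")
```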

Most striking were the differences in expres- 
sion levels obtained with the scFv-C and 
scFv-CK constructs. No transgenic tobacco 
plants could be found with detectable levels of 
scFv-C antibody. In contrast to this finding, 
among the scFv-CK transformants 60% of the
plants showed detectable antibody levels. In one
plant the scFv-CK protein level reached 0.2% of 
total soluble protein, which is comparable with 
previously reported cytosolic expression levels 
[27, 36]. This difference in expression between 
the two constructs (scFv-C and scFv-CK) was 
also found in the transient expression assay. The 
steady state levels of scFv mRNA indicated that 
the difference in protein accumulation most likely 
depended on differences in stability of the protein. 
This phenomenon is not unique for the anti- 
cutinase scFv, since we have recently obtained 
similar results with another scFv antibody (un- 
published results). 

Presently we can only speculate on the factors 
which cause these KDEL correlated differences 
in expression in plants. Assuming that both 
scFv-C and scFv-CK antibodies are located in 
the cytosol, it might be possible that the 
C-terminal extension of the scFv-CK antibody, 
which is in fact six amino acids long (DIKDEL),
protects the scFv from C-terminal degradation by
exo-proteinases. This then would suggest that
particular exo-proteinases are involved in the
breakdown of scFvs. Alternatively, the DIKDEL
sequence may indirectly protect the scFv from
proteolytic attack via a KDEL mediated interac-
tion with the cytosolic side of the ER salvage
receptor [35].

Another explanation for the observed differ- 
ences could be that expression levels of the scFv 
are correlated to a different subcellular location. 
It has been well documented that the expression 
of normally secreted proteins, particularly those 
with disulphide bridges, in the cytosol of plant 
cells is very low [4, 12, 33]. It is therefore not 
surprising that the scFv-C transformants failed to 
produce detectable amounts of scFv antibodies. 
Protein analysis using the algorithm for predict- 
ing signal peptidase cleavage sites [39] within the 
GCG Wisconsin program revealed that both ma- 
ture scFv-C and scFv-CK proteins did not con- 
tain a signal peptide-like sequence in the amino- 
terminal region. Possibly, the KDEL containing 
scFvs, even when no signal peptides are added, 
are directed away from the cytosol to a more 
favourable location, presumably the ER. The 
presence of substantial amounts of the 65 kDa 
protein in the scFv-CK transgenic plants along 
with its functionality might indicate an ER loca- 
tion. Noteworthy in this respect is the recent sug- 
gestion that scFv antibodies targeted to the 
cytosol of animal cells were actually 'mistranslo- 
cated' to the ER [20]. In addition, alternative 
pathways for secretory proteins, lacking signal 
peptides, have been put forward [24]. 

The very low expression from the scFv-S con- 
struct in transgenic plants and transgenic proto- 
plasts contrasts with the result obtained in tran- 
sient expression experiments where we could 
detect the scFv-S extracellularly. Possibly the 
protoplasts used for the transient assay were
physiologically different from the transgenic
scFv-S protoplasts and released fewer proteases
into the incubation medium. Firek et al. [11] re-
ported high expression levels in plants when an
anti-phytochrome scFv antibody was being se-
creted. The reason for this difference in stability
between different scFv antibodies is not clear but
may depend on the amino acid constitution in the
variable domains of the scFv antibodies or the
linker peptide.

Efficient expression of scFv antibodies in dif-
ferent subcellular sites seems feasible. However,
it should be kept in mind that successful expres-
sion of functional scFvs in the cytosol may only 
be found under certain conditions, like an scFv 
amino acid sequence which remains relatively 
stable [36] or at least can be stabilized by the 
presence of the antigen [1, 11]. The C-terminal 
addition of the retention sequence KDEL as a 
contributing factor for scFv stabilization opens 
additional opportunities for expressing scFv anti-
bodies in plants.

[22] ACP IMMUNOAFFINITY CHROMATOGRAPHY



Yields of ACPs Retained by the Anti-Escherichia coli ACP Immunoaffinity Column

Source of ACP purified by           Protein   Units of ACP   ACP-dependent fatty acid
affinity chromatography             (µg)      activityᵃ      synthase activityᵇ

E. coli                             160       203            +
Euglena gracilis strain Z           151       162            +
Euglena gracilis var. bacillaris    168       NM             +



ᵃ One unit of ACP is 1 nmol of CO2 exchanged per 15 min in the malonyl-CoA-CO2
exchange reaction. The specific activities of the ACPs purified are lower than those
of freshly reduced, desalted ACP. This is probably due to the absence of thiol
reagents in the chromatographic procedures and to the high salt concentration
of the eluents.

ᵇ ACP is a substrate for the ACP-dependent fatty acid synthase from Euglena gracilis.
There is no fatty acid biosynthesis in the absence of ACP; therefore, fatty acid biosyn-
thesis is a sensitive indication of the presence of functional ACP.
NM, not measured.



phate pH 6.2, 0.5 M NaCl, until the absorbance (280 nm) of the effluent is
zero. Finally, specifically bound ACP is removed by elution with 0.2 M
glycine, pH 2.8, 0.5 M NaCl. After elution of antigen, the immunoadsor-
bent is equilibrated and stored in 0.01 M potassium phosphate, pH 6.2, 0.1
M NaCl. It should be washed with several column volumes of the same
mixture every 2 weeks.

Properties of the Protein Retained on the Immunoaffinity Column 

The elution profiles of the E. coli and E. gracilis strain Z and variety bacillaris
ACPs from the immunoaffinity column are shown in Fig. 3. In each case,
upon application of a crude preparation to the column, protein is specifi-
cally retained and later released under acidic conditions. When an excess
of ACP is processed through the column, a similar result is
obtained. The yield achieved in the immunoaffinity chromatography step
is a function of the binding capacity of the immunoadsorbent. The yields
are identical from one column run to the next (Table) regardless of the
excessive amount and stage of purity of the ACP applied to the column.
Discontinuous electrophoresis, in 14% acrylamide gels, of the material
applied to the affinity column and of the selectively retained protein dem-
onstrates the extent of purification achieved by the single step (Fig. 4).

The material retained from the crude E. coli ACP preparation shows a
single major band at Rf 0.85 (Fig. 4B), identical to E. coli ACP purified
according to Majerus et al. (Fig. 4C). The
FATTY ACID SYNTHESIS



[23] 



gracilis strain Z 70-95% saturation ammonium sulfate fraction exhibits a
major band at Rf 0.30, a lesser band at Rf 0.35, and one or two other very
faint bands (Fig. 4D). Since the immunoaffinity chromatography is done in
the absence of thiol reagents, it is possible that some of the ACP is present
as disulfide bridge-linked dimer. The elution pattern of E. gracilis var.
bacillaris ACP from the immunoaffinity column is biphasic (Fig. 3C), but
electrophoresis of material in each peak shows a single protein band at Rf
0.39 (Fig. 4G,H). In all cases, upon neutralization and concentration, the
retained protein is biologically active in the ACP assays (Table).

The small-scale purifications described here illustrate the potential
usefulness of immunoaffinity chromatography in obtaining ACPs from
diverse sources. Under the conditions described, stable and reproducible
results are obtained through more than 25 runs on a single column.

[23] Acyl-Acyl Carrier Protein Thioesterase from Safflower
By Tom McKeon and Paul K. Stumpf 

The acyl-ACP¹ thioesterase catalyzes the hydrolysis of acyl-ACP to
free fatty acid and ACP-SH.

Acyl-S-ACP + H2O → acyl-OH + ACP-SH

The thioesterase is of interest because it terminates the set of 
biosynthetic reactions that take place on ACP, a water-soluble and lipid- 
insoluble acyl carrier. Further metabolism of fatty acids appears to occur 
in membrane systems. Thus, acyl-ACP thioesterase may play an impor-
tant role in regulating the fatty acid composition of plant tissue.²

¹ The abbreviations are ACP, acyl carrier protein; BSA, bovine serum albumin.

² W. E. Shine, M. Mancha, and P. K. Stumpf, Arch. Biochem. Biophys. 172, 110 (1976).

ACYL-ACP THIOESTERASE

Assay Method

Principle. The acyl-ACP thioesterase assay involves the measurement 
of labeled fatty acid released from labeled acyl-ACP. The free fatty acids 
are extracted into petroleum ether and counted in a liquid scintillation 
counter. 

Reagents 

Glycine, 0.20 M, pH 9.0

Bovine serum albumin, 10 mg/ml in water

[14C]Stearoyl-ACP, 10 µM in 0.02 M potassium phosphate, pH 6.8
(synthesis described in this volume³)

Acetic acid, 1 M, in isopropanol with 5 mg/ml each of palmitic and
stearic acid

Petroleum ether, reagent grade, saturated with isopropanol-water,
1 : 1 (v/v)

Procedure. The reaction mixture in a 13 x 100 mm screw-cap tube
contains 100 µl of glycine buffer, 70 µl of water, 10 µl of BSA and 10 µl of
thioesterase preparation appropriately diluted. The reaction is started by
the addition of 10 µl of [14C]stearoyl-ACP, and the reaction is stopped
after 10 min at room temperature (20-23°) by the addition of 0.2 ml of the 1
M acetic acid reagent. After 10 min, the free fatty acids are extracted with
two 2-ml portions of the petroleum ether and the extract is counted.

The assay is linear with respect to time and enzyme concentration up 
to 40% hydrolysis of substrate. One unit of activity is equal to a rate of 
hydrolysis of 1 µmol per minute per milligram of protein.
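
The composition of the reaction mixture fixes the final concentration of each reagent and the amount of substrate available for hydrolysis. The sketch below works these out from the volumes and stock concentrations listed under Reagents and Procedure. The data structure and function names are illustrative assumptions, not part of the published method.

```python
# Component: (volume in µl, stock concentration, unit), taken from the Procedure.
# None marks components without a stated concentration.
ASSAY_MIX = {
    "glycine buffer": (100.0, 0.20, "M"),
    "water": (70.0, None, None),
    "BSA": (10.0, 10.0, "mg/ml"),
    "thioesterase preparation": (10.0, None, None),
    "[14C]stearoyl-ACP": (10.0, 10.0, "µM"),
}

LINEAR_LIMIT_FRACTION = 0.40  # the assay is linear up to 40% hydrolysis of substrate


def total_volume_ul(mix):
    """Total reaction volume in µl."""
    return sum(volume for volume, _, _ in mix.values())


def final_concentration(mix, name):
    """Final concentration of one component after mixing, in the unit of its stock."""
    volume, stock, unit = mix[name]
    if stock is None:
        raise ValueError(f"no stock concentration given for {name}")
    return stock * volume / total_volume_ul(mix), unit


def substrate_pmol(mix, name="[14C]stearoyl-ACP"):
    """Amount of substrate per tube; µM multiplied by µl gives pmol."""
    volume, stock, _ = mix[name]
    return stock * volume


if __name__ == "__main__":
    print(f"total volume: {total_volume_ul(ASSAY_MIX):.0f} µl")
    for name in ("glycine buffer", "BSA", "[14C]stearoyl-ACP"):
        value, unit = final_concentration(ASSAY_MIX, name)
        print(f"{name}: {value:.2f} {unit} final")
    total = substrate_pmol(ASSAY_MIX)
    print(f"substrate: {total:.0f} pmol per tube, "
          f"linear range up to {LINEAR_LIMIT_FRACTION * total:.0f} pmol hydrolysed")
```

With these inputs the 200 µl reaction holds 100 pmol of stearoyl-ACP, so the linear range ends at about 40 pmol of released fatty acid.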

Purification 

Acetone Powder Extract. This material is obtained from acetone pow-
der of safflower by the method described for stearoyl-ACP desaturase.³

Acid Precipitate. The acetone powder extract is cooled on ice and 
acidified to pH 5.2 with glacial acetic acid. After 1 hr, the precipitate is 
centrifuged at 10,000 g for 10 min, and the supernatant is adjusted to pH 
4.3 with acetic acid. After 1 hr, the precipitate is pelleted and resuspended 
in one-half the starting volume of 0.02 M potassium phosphate buffer, pH 
6.8. Insoluble debris is centrifuged out, and the supernatant retains 60% to 
80% of the acyl-ACP thioesterase activity (see the table) and less than 5% 
of the stearoyl-ACP desaturase activity.⁴

ACP-Sepharose 4B column. This column is run exactly as described
for the purification of stearoyl-ACP desaturase. The thioesterase elutes
with the 0.30 M phosphate wash, with the early fractions containing pro-
portionally more thioesterase and the later fractions more of the de-
saturase.^

³ T. McKeon and P. K. Stumpf, this volume [34].
⁴ T. McKeon, unpublished data, 1979.

Purification of Acyl-ACP Thioesterase

                          Total      Total      Specific
                          proteinᵃ   activity   activity   Yield   Purification
Fraction                  (mg)       (mU)       (mU/mg)    (%)     (fold)

Acetone powder extract    400        86         0.22
Acid precipitate          57         64         1.12       74      5
ACP-Sepharose 4B          0.13       23         170        27      770

ᵃ Protein was determined by the method of O. H. Lowry, N. J. Rosebrough, A. L. Farr,
and R. J. Randall, J. Biol. Chem. 193, 265 (1951), using bovine serum albumin as the
standard.

Purity. As seen in the table, the acyl-ACP thioesterase is purified 770-
fold by this procedure. The stearoyl-ACP desaturase is present as approx-
imately 5% of the bulk protein in the purified preparations of the thioes-
terase.⁴
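
The derived columns of the purification table (specific activity, yield and fold purification) follow from total protein and total activity alone. The sketch below recomputes them. The protein and activity values are those printed in the table, while the function and field names are illustrative assumptions. Because the printed entries are rounded, the recomputed figures for the last step (about 177 mU/mg and about 820-fold) differ somewhat from the printed 170 mU/mg and 770-fold.

```python
# (fraction, total protein in mg, total activity in mU), as printed in the table.
FRACTIONS = [
    ("Acetone powder extract", 400.0, 86.0),
    ("Acid precipitate", 57.0, 64.0),
    ("ACP-Sepharose 4B", 0.13, 23.0),
]


def purification_table(fractions):
    """Derive specific activity, yield and fold purification for each step.

    Yield and fold purification are expressed relative to the first fraction.
    """
    _, start_protein, start_activity = fractions[0]
    start_specific = start_activity / start_protein
    rows = []
    for name, protein_mg, activity_mu in fractions:
        specific = activity_mu / protein_mg
        rows.append({
            "fraction": name,
            "specific_activity": specific,                     # mU/mg
            "yield_percent": 100.0 * activity_mu / start_activity,
            "fold": specific / start_specific,
        })
    return rows


if __name__ == "__main__":
    for row in purification_table(FRACTIONS):
        print(f"{row['fraction']:<24}"
              f"{row['specific_activity']:>9.2f} mU/mg"
              f"{row['yield_percent']:>7.0f} %"
              f"{row['fold']:>8.0f}-fold")
```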



Properties 

Specificity. Acyl-ACP thioesterase from safflower has a strong prefer- 
ence for oleoyl-ACP as substrate. The preference for substrates under 
routine assay conditions is oleoyl-ACP > stearoyl-ACP > palmitoyl-
ACP with relative rates of 10 : 2 : 1, respectively. The rates of hydrolysis
of oleoyl-CoA and stearoyl-CoA are less than 2% of the rate of hydrolysis
of the corresponding acyl-ACP.^

Stability. Preparations purified through the ACP-Sepharose 4B column
step are stable for 3 weeks at 4° when maintained in 1 mM DTT.⁴

pH Activity Profile. The thioesterase is half-maximally active at pH 8.5
and pH 10.0 with optimum activity at pH 9.5. The thioesterase has less
than 2% maximal activity at pH 6.5 and below, where the stearoyl-ACP
desaturase is maximally active.⁴

Preface



The interactions of fungi with mankind are both beneficial and harmful 
and are deeply rooted in the history of human society and agriculture. Over 
the centuries man has sought to manipulate the growth of fungi to his 
advantage; the methods used though largely empirical have often been 
highly successful. Since the initial development of recombinant DNA
technology in bacteria in the early 1970s, biology has been undergoing a
revolution which is spreading to all organisms, including fungi. This rev-
olution is marked by the emergence of a new discipline, molecular biology,
at the interface between biochemistry and genetics. The approach and 
techniques of molecular biology enable us to ask and answer fundamental 
questions about many aspects of fungal biology, and open the way to the 
directed manipulation of fungal metabolism. 

This book arises from a symposium on 'Fungal Molecular Biology' held 
by the British Mycological Society at the University of Nottingham in April 
1990. Altogether, there were 29 main papers presented at the symposium, 
covering a broad range of both fundamental and applied aspects of fungal
molecular biology. In considering a book based on the meeting it seemed 
desirable, given the inevitable restrictions on space and cost, to focus on 
one or two areas. The editors decided to highlight the rapid development 
of gene transfer and cloning techniques in fungi and the ways in which 
these are being exploited in species of economic importance either in
biotechnology or as plant pathogens. The 11 contributions in this volume 
were selected on that basis. 

The relevant methodologies for gene manipulations in fungi are de- 
scribed in the first three chapters. In chapter 1 (Van den Hondel & Punt) 
the development of suitable vectors and gene transfer systems for filamen-
tous fungi is discussed and the wide applicability of these techniques to all
fungi is clearly established. One point that emerges is that although a basis
of classical genetics is useful, it is not essential. A central feature of this
new approach to genetic manipulation is the cloning of genes; several
strategies are available in filamentous fungi and the most applicable in
each situation can be readily identified (Chapter 2, Turner). To date, the 
technology for introducing vectors into fungal cells has been restricted 
to uptake into protoplasts. Workers manipulating plant and animal cells have
explored more 'dramatic' procedures as described by Watts & Stacey
(Chapter 3). 

Not surprisingly, progress in yeast molecular biology has been even 
more rapid than that with filamentous fungi. Several contributions con- 
cerning yeast research were included in the symposium to provide a point 
of reference for possible future developments with the filamentous fungi. 
Advances with Saccharomyces cerevisiae stem, in part, from its importance 
in brewing, where several opportunities for exploitation of recombinant 
strains exist (Chapter 8, Hinchliffe), but mainly from previously estab- 
lished fundamental knowledge of biochemistry, cell biology and genetics 
in this organism. A clear example of building on the latter is the use of 
Saccharomyces as a host for the expression of heterologous proteins 
(Chapter 4, Ogden). Despite the fact that this fungus secretes only a 
limited range of proteins naturally, it can be engineered genetically to 
secrete significant amounts of recombinant proteins. The success with 
Saccharomyces prompted interest in several other yeasts including the 
methylotrophic species and several systems are now operational (Chapter
7, Veale & Sudbery; Chapter 10, Strasser et al.).

Industrially, the filamentous fungi are best known as sources of antibio-
tics, organic acids and enzymes. Several of the genes encoding biosynthetic
enzymes for β-lactam synthesis have been cloned and manipulated; the
advances made in this area in Cephalosporium (Acremonium) are con- 
sidered by Skatrud et al. (Chapter 9). Trichoderma species are used 
commercially as the producers of a range of hydrolytic enzymes which are 
secreted into the growth medium. The cellulase system has been investi- 
gated using molecular genetic techniques and this has led not only to 
improvements in cellulase production, but also to the exploitation of this 
fungus as a host for the expression of heterologous proteins (Chapter 5, 
Penttila et al.). The Aspergilli are of particular importance in fungal
molecular biology as they contain both model experimental and indus- 
trially important species. Several of these species are the subject of intense 
study aimed at developing them as hosts for the commercial production 
of mammalian proteins (Chapter 6, Davies). 

The detrimental economic effects of fungi as agents of plant disease are 
of even greater importance than the beneficial role of fungi in biotechnol- 
ogy. Most phytopathogenic fungi are not amenable to study by the classical 
methods of genetics and biochemistry and, as a result, the basic mechan-
isms of fungal pathogen-plant host interactions are poorly understood.
However, the approach and techniques of molecular genetics bypass many 
of these difficulties and are transforming knowledge of all aspects of the 
biology of these fungi. Clearly there is a long way to go before we under-
stand the molecular basis of fungal pathogenicity, but sound foundations
are being laid as described in the final chapter (Chapter 11, Oliver et al.). 

The editors of this volume are grateful to the British Mycological 
Society for providing the means to organise such a timely and interesting
symposium and for supporting the publication of this volume. Generous
donations towards the costs of the symposium from Eicon Biochemicals,
Cambridge University Press, Glaxo Group Research, Pfizer, SmithKline
Beecham and Xenova are gratefully acknowledged. We wish to thank all 
those who contributed to the meeting and, in particular, the authors of the 
chapters in this volume for their cooperation in preparing the manuscripts 
for this book in as short a time as possible. Finally, special thanks go to 
David Moore and Page Design for their help, guidance and great effi- 
ciency in producing the book. 
Chapter 1



Gene transfer systems and vector development for filamentous fungi

Cees A. M. J. J. van den Hondel & Peter J. Punt 

Filamentous fungi have a number of properties which make them important 
both scientifically and economically. The economic importance can be 
illustrated by the large variety of products that are made by filamentous 
fungi, such as organic acids (e.g. citric acid), antibiotics (e.g. penicillin 
and cephalosporin) and numerous industrial enzymes (e.g. glucoamylase). 
Filamentous fungi are also used as food (mushrooms), food additives (e.g. 
the meat extender 'Quorn') and condiments (e.g. soy sauce). A severe,
negative economic influence of filamentous fungi is their detrimental 
effect on crop yield. Plant pathogenic fungi cause annual crop losses of 
billions of pounds. In addition to their economic importance, filamentous 
fungi have interesting biological properties such as a complex life cycle,
cell differentiation, highly regulated metabolic pathways and efficient 
secretion of proteins which make them attractive as a model for basic 
biological research of eukaryotic organisms. 

In the pre-recombinant DNA period, physiological, biochemical and 
genetic studies were mainly carried out with Neurospora crassa and 
Aspergillus nidulans. Their haploid genomes, rapid life cycles, simple 
nutrient requirements and well developed genetic systems made them 
attractive model systems. Hence, it stands to reason that after the intro- 
duction of recombinant DNA techniques, systems for molecular genetic 
analysis were first developed in these intensively studied filamentous 
fungi. Thereafter, similar molecular techniques have been extended to 
less amenable species. 

A prerequisite for molecular genetic research in filamentous fungi is 
the availability of a gene transfer system comprising a vector containing a
selectable marker and a transformation procedure for introduction of the 
vector into the fungus. The specific properties of different types of selec- 
tion markers can be used to design vectors for specific genetic 
manipulation strategies necessary for molecular genetic studies. 

Recently, several excellent reviews have been published about transfor- 
mation and genetic engineering of filamentous fungi (Fincham, 1989; 
Timberlake & Marshall, 1989; Goosen, Bos & Van den Broek, 1990). 

Table 1.1. Overview of transformation systems used for filamentous fungi

Mycelial treatment     References

Protoplasts
  CaCl2/PEG            Peberdy (1989) and references therein
  liposomes            Radford et al. (1981)
  electroporation      Ward et al. (1989); Thomas & Kenerly (1989);
                       Goldman, Van Montagu & Herrera-Estrella (1990)
Intact cells
  Li acetate           Fincham (1989) and references therein;
                       Bej & Perlin (1989)
  biolistic            Armaleo et al. (1990)



of the gene transfer systems developed. Special attention will be given to 
some applications of these systems for genetic manipulation in Aspergillus. 

Gene transfer systems 

For genetic manipulation of filamentous fungi a gene transfer system is 
required that permits introduction of exogenous DNA and selection of 
those cells that have incorporated this DNA. This selection can be 
achieved by covalently linking the DNA to a vector which contains a 
selection marker. Both transformation frequency and type of transfor- 
mant can be manipulated by using different types of vector. 

Transformation procedure 

The procedure to obtain DNA-mediated transformed fungal cells com- 
prises the following steps: 

• preparation of cells (protoplasts) which are competent to take up 
(vector) DNA 

• treatment of these cells with the DNA 

• regeneration of colony forming units 

• selection/detection of those cells that have stably incorporated 
DNA. 

A summary of the transformation systems used for filamentous fungi is 
given in Table 1.1. 

Most frequently, protoplasts are used for the introduction of exogenous 
DNA. These protoplasts are obtained by incubation of mycelium or 
spores with cell wall-degrading enzymes in the presence of a compound
that stabilizes the protoplasts (for an extensive overview of the different 
procedures and enzymes used, see Peberdy (1989)). Recently, transfor-
mation through electroporation of protoplasts was described (for refer-
ences, see Table 1.1). Compared to the generally used CaCl2/PEG
method, no significant improvement of transformation frequency was 
observed. 

A few reports describe the use of intact cells for transformation. Both 
incubation of cells with lithium acetate and particle bombardment 
(chapter 3) have been successfully used for the transformation of filamen- 
tous fungi (for references, see Table 1.1). These methods have the obvious 
advantage that the sometimes laborious protoplast preparation steps can 
be omitted. 

Selection markers 

Three types of selectable marker are used for selection of transformed 
cells: (a) a gene coding for a suppressor tRNA, (b) auxotrophic markers 
and (c) dominant selectable markers. 

To date there is only one example of a suppressor tRNA gene (su-S, 
presumably a mutant tRNA gene) used as selection marker (Brygoo & 
Debuchy, 1985). Although this type of marker potentially can be used in
each fungal strain which contains a suppressible chain termination muta- 
tion, no additional reports of the application of suppressor tRNA genes 
as selection marker have been published. 

Auxotrophic markers are the most commonly used method for selec- 
tion of transformants. Obviously, a prerequisite for their successful use is 
the presence of the appropriate mutation in the fungus. In Table 1.2 an 
overview is given of the auxotrophic markers which have been used. As 
can be seen from this Table, both homologous and heterologous markers 
can be used for transformation of fungi. 

Some of the markers used (e.g. pyrG, niaD and trpC) have proved to be
very useful, since they are functional in several species (Table 1.2). Fur-
thermore, both pyrG and niaD are attractive markers for developing a
gene transfer system in genetically poorly characterized fungal species, 
since the required mutants can be isolated by positive selection. In the 
case of pyrG they can be isolated by resistance against 5-fluoro-orotic acid 
(Van Hartingsveldt et al., 1987; Goosen et al., 1987) and in the case of
niaD by resistance against chlorate (Unkles et al., 1989). Since it is
possible to select both for and against the mutant and wild-type pheno- 
types, these markers are also particularly useful for genetic manipulation 
strategies, such as gene-replacement experiments. 

One of the obvious disadvantages of auxotrophic markers is the need 
to isolate a recipient strain with the appropriate mutation. With dominant 
selectable markers both wild-type and mutant strains can be transformed. 
A list of dominant markers which are utilized is given in Table 1.3. Several
Table 1.2. Auxotrophic selectable markers used for homologous and/or
heterologous transformation of filamentous fungi.

Marker (species)**                    Encoded function                      Transformed species*         Reference

acuA+ (Ustilago maydis)               acetyl-CoA synthase                   Ustilago maydis              Hargreaves & Turner (1989)
acuD+ (Aspergillus nidulans)          isocitrate lyase                      Aspergillus nidulans         Ballance & Turner (1986)
ade-2+ (Schizophyllum commune)        unknown                               Phanerochaete chrysosporium  Kornegay, Pribnow & Gold
am+ (Neurospora crassa)               glutamate dehydrogenase               Neurospora crassa            Kinsey & Rambosek (1984)
amdS+ (Aspergillus nidulans)          acetamidase                           Aspergillus nidulans         Tilburn et al.
argB+ (Aspergillus nidulans)          L-ornithine carbamoyltransferase      Aspergillus nidulans         John & Peberdy
                                                                            Aspergillus niger            & Davies (1985)
inl+ (Neurospora crassa)              unknown                               Neurospora crassa            Akins & Lambowitz (1985)
leu+ (Mucor circinelloides)           unknown                               Mucor circinelloides         Van Heeswijck & Roncero (1984)
met+ (Aspergillus oryzae)             unknown                               Aspergillus oryzae           Iimura et al. (1987)
met-2+ (Ascobolus immersus)           homoserine-O-transacetylase           Ascobolus immersus           Goyon & Faugeron (1989)
niaD+ (Aspergillus nidulans)          nitrate reductase                     Aspergillus niger            Malardier et al. (1989)
                                                                            Fusarium oxysporum           Malardier et al. (1989)
niaD+ (Aspergillus niger)             nitrate reductase                     Aspergillus niger            Unkles et al. (1989)
                                                                            Penicillium chrysogenum      Whitehead et al. (1989)
niaD+ (Aspergillus oryzae)            nitrate reductase                     Aspergillus oryzae           Unkles et al. (1989)
                                                                            Aspergillus nidulans         Unkles et al. (1989)
Table 1.2. continued.

Marker (species)**                    Encoded function                      Transformed species*         Reference

nic-1+ (Neurospora crassa)            unknown                               Neurospora crassa            Akins & Lambowitz (1985)
pkiA+ (Aspergillus nidulans)          pyruvate kinase                       Aspergillus nidulans         De Graaff, Van den Broek & Visser (1988)
prn+ (Aspergillus nidulans)           proline catabolism                    Aspergillus nidulans         Durrens et al. (1986)
pyr-3+ (Ustilago maydis)              dihydroorotase                        Ustilago maydis              Banks & Taylor (1988)
pyr-4+ (Neurospora crassa)            orotidine-5'-phosphate decarboxylase  Aspergillus nidulans         Ballance, Buxton & Turner (1983)
pyr-6+ (Ustilago maydis)              orotidine-5'-phosphate decarboxylase  Ustilago maydis              Kronstad et al. (1989)
pyrG+ (Aspergillus nidulans)          orotidine-5'-phosphate decarboxylase  Aspergillus nidulans         Oakley et al. (1987)
pyrG/A+ (Aspergillus niger)           orotidine-5'-phosphate decarboxylase  Aspergillus niger            Van Hartingsveldt et al. (1987), Goosen et al. (1987)
                                                                            Aspergillus nidulans         Van Hartingsveldt et al. (1987)
pyrG+ (Aspergillus oryzae)            orotidine-5'-phosphate decarboxylase  Aspergillus oryzae           De Ruiter-Jacobs et al. (1989)
                                                                            Aspergillus niger            De Ruiter-Jacobs et al. (1989)
pyroA+ (Aspergillus nidulans)         unknown                               Aspergillus nidulans         May et al. (1989)
qa-2+ (Neurospora crassa)             catabolic dehydroquinase              Neurospora crassa            Case et al. (1979)
QUTE+ (Aspergillus nidulans)          catabolic dehydroquinase              Aspergillus nidulans         Da Silva et al. (1986)
riboB+ (Aspergillus nidulans)         unknown                               Aspergillus nidulans         Oakley et al. (1987)
trp-1+ (Cochliobolus heterostrophus)  trifunctional enzyme of tryptophan    Aspergillus nidulans         Turgeon et al. (1986)
                                      biosynthesis***
Table 1.2. continued.

Marker (species)**                    Encoded function                      Transformed species*         Reference

trp-1+ (Coprinus cinereus)            tryptophan synthesis                  Coprinus cinereus            Binninger et al. (1987)
trp-1+ (Schizophyllum commune)        trifunctional enzyme of tryptophan    Schizophyllum commune        Munoz-Rivas et al. (1986)
                                      biosynthesis***                       Coprinus cinereus            Casselton & De La Fuente Herce (1989)
trp-1+ (Neurospora crassa)            trifunctional enzyme of tryptophan    Neurospora crassa            Kim & Marzluf (1988)
                                      biosynthesis***
trp-3+ (Neurospora crassa)            tryptophan synthetase                 Neurospora crassa            Vollmer & Yanofsky (1986)
trpC+ (Aspergillus nidulans)          trifunctional enzyme of tryptophan    Aspergillus nidulans         Yelton, Hamer & Timberlake (1984)
                                      biosynthesis***
trpC+ (Aspergillus niger)             trifunctional enzyme of tryptophan    Aspergillus niger            Goosen et al. (1989)
                                      biosynthesis***                       Aspergillus nidulans         Horng, Linz & Pestka (1989)
trpC+ (Phanerochaete chrysosporium)   trifunctional enzyme of tryptophan    Coprinus cinereus            Casselton & De La Fuente Herce (1989)
                                      biosynthesis***
trpC+ (Penicillium chrysogenum)       trifunctional enzyme of tryptophan    Penicillium chrysogenum      Sánchez et al. (1987), Picknett et al. (1987)
                                      biosynthesis***                       Aspergillus nidulans         Picknett et al. (1987)
ura-5+ (Podospora anserina)           orotidylic acid pyrophosphorylase     Podospora anserina           Bégueret et al. (1984)

* listed here are the first species that have been transformed with the marker
indicated by homologous or heterologous transformation. In several cases
other species have subsequently been transformed with the same marker.

** the species from which the marker was isolated is indicated in parentheses.

*** encodes for glutamine amidotransferase, indoleglycerolphosphate
synthetase and phosphoribosylanthranilate isomerase.
Fig. 1.1. Schematic representation of plasmid pAN8-1, which confers
phleomycin resistance after transformation (Mattern, Punt & Van den Hondel,
1988). Thick line represents A. nidulans DNA, punctuated line Streptoalloteichus
hindustanus DNA, and thin line E. coli DNA; pgpdA, promoter region of the gpdA
gene; ttrpC, terminator region of the trpC gene; ble, phleomycin resistance gene;
Ap^r, ampicillin resistance gene. Arrows indicate the direction of transcription.



of these markers are 'broad-host range' markers which can be employed 
in different fungal species. All but one of these markers are based on 
drug-resistance. They consist of either mutant fungal genes such as 
benomyl resistant β-tubulin (benA, May et al., 1985), or bacterial anti-
biotic-resistance genes provided with expression signals of filamentous
fungi. The only exception is the acetamidase gene of A. nidulans (amdS,
Kelly & Hynes, 1985), which is a nutritional marker. Transformants 
containing this gene are able to use acetamide or acrylamide as a sole 
nitrogen and carbon source. In general, fungi cannot readily use these 
compounds as such. 

An example of a vector containing a bacterial resistance gene as selec- 
tion marker is shown in Fig. 1.1. In this case the Streptoalloteichus 
hindustanus phleomycin resistance (ble) gene was introduced in a fungal 
expression vector containing the promoter region of the highly expressed 
A. nidulans gpdA gene and the terminator region of the A. nidulans trpC 
gene (Punt et al., 1987). This vector and a similar one, pAN7-1, containing
the Escherichia coli hygromycin B resistance gene have been used for the 
Table 1.3. Dominant selectable markers

Marker*                               Encoded function                      Transformed species          Reference

amdS (Aspergillus nidulans)           acetamidase                           Aspergillus niger**          Kelly & Hynes (1985)
bar (Streptomyces hygroscopicus)      phosphinothricin acetylase            Neurospora crassa            Avalos et al. (1989)
benA (Aspergillus nidulans)           benomyl resistant β-tubulin           Aspergillus nidulans         May et al. (1985)
ble (Escherichia coli)                phleomycin binding protein            Penicillium chrysogenum      Kolar et al. (1988)
ble (Streptoalloteichus hindustanus)  phleomycin binding protein            Aspergillus nidulans/        Mattern, Punt & Van den Hondel (1988)
                                                                            Aspergillus niger**
5FI^r (Coprinus cinereus)             5-fluoroindole (feedback) resistant   Coprinus cinereus            D. M. Burrows, T. J. Elliott & L. A. Casselton (unpublished)
                                      anthranilate synthetase
G418^r (Escherichia coli)             geneticin/neomycin/kanamycin          Ustilago maydis**            Banks (1983)
                                      phosphotransferase
hph (Escherichia coli)                hygromycin B phosphotransferase       Cephalosporium acremonium**  Queener et al. (1985)
oliC (Aspergillus nidulans)           mitochondrial ATP synthase subunit 9  Aspergillus nidulans         Ward, Wilkinson & Turner (1986)
oliC (Aspergillus niger)              mitochondrial ATP synthase subunit 9  Aspergillus niger            Ward et al. (1988)
oliC (Penicillium chrysogenum)        mitochondrial ATP synthase subunit 9  Penicillium chrysogenum      Bull, Smith & Turner (1988)
sul1 (Escherichia coli)               dihydropteroate synthetase            Penicillium chrysogenum      Carramolino et al. (1989)
tub (Colletotrichum graminicola)      benomyl resistant β-tubulin           Colletotrichum graminicola   Panaccione, McKierman & Hanau (1988)
tub-2 (Neurospora crassa)             benomyl resistant β-tubulin           Neurospora crassa**          Orbach, Porro & Yanofsky (1986)
tubA (Septoria nodorum)               benomyl resistant β-tubulin           Septoria nodorum**           Cooley & Caten (1989)

* the species from which the marker gene was isolated is indicated in
parentheses. ** the species listed is the first species transformed with the
marker. For the other markers transformation of only one species has been
Types of vector

In general, vectors used for transformation experiments comprise E. 
coli plasmid DNA and the appropriate selectable marker. In most fungal
species vector DNA becomes integrated into the genome of the host after 
transformation. Although considerable effort was undertaken to con-
struct autonomously replicating vectors for A. nidulans and Neurospora
crassa, using a strategy similar to that described for Saccharomyces
cerevisiae (Stinchcomb, Struhl & Davis, 1979), no autonomous replication
of the vector could be detected (Ballance & Turner, 1985; Buxton &
Radford, 1984; Paietta & Marzluf, 1985; Van Gorcom, unpublished). In
one case, however, a DNA sequence (the A. nidulans ans1 sequence)
which considerably enhances the transformation frequency was isolated. 
Nevertheless, even this vector did not replicate autonomously (Ballance 
& Turner, 1985). 

For some other species, autonomously replicating vectors were suc- 
cessfully constructed by adding into an integration vector autonomously 
replicating sequences (ARS) (Ustilago maydis, Tsukuda et al., 1988), the
chromosomal ends of Tetrahymena thermophila (Podospora anserina,
Perrot, Barreau & Begueret, 1987), or the termini of naturally occurring
linear plasmids of Nectria haematococca (Ustilago maydis, Samac &
Leong, 1989). 

In contrast to the results obtained for the ascomycetous fungi Neuros-
pora and Aspergillus, in zygomycetous fungi, like Mucor circinelloides (van
Heeswijck, 1986), Phycomyces blakesleeanus (Revuelta & Jayaram, 1986),
and Absidia glauca (Wostemeyer, Burmester & Weigel, 1987) autono-
mous replication of vectors was observed in most cases. Autonomous
replication was also observed for a filamentous yeast species,
Trichosporon cutaneum (Glumoff et al., 1989) transformed with pAN7-1
(see above). 

Fate of transforming DNA 

As already mentioned, in most filamentous fungi vector DNA is inte- 
grated into the genome. Biochemical analysis of the DNA of 
transformants indicates that when a homologous selection marker is used, 
in general three types of integration events can occur: type I, integration 
of the vector by homologous recombination; type II, ectopic integration 
of the vector (or vector sequences) by non-homologous recombination; 
and type III, gene replacement. For most homologous selectable markers, 
predominantly homologous interactions (type I and III integrations) 
occur. However, in some cases type II transformants are preferentially 
found, e.g. in A. nidulans with the amdS gene (Wernars et al., 1985) or the
prn gene cluster (Durrens et al., 1986) and in Ascobolus immersus with
Table 1.4. Fungal species successfully transformed with the vectors
pAN7-1 and/or pAN8-1

                                      Vector
Transformed species                   pAN7-1  pAN8-1  Reference

Acremonium chrysogenum                +       ND      A. W. Smith, M. Ramsden & J. F. Peberdy (unpubl.)
Aspergillus nidulans                  +       +       Punt et al. (1987)
Aspergillus niger                     +       +       Punt et al. (1987)
Aspergillus ficuum                                    Mullaney, Punt & Van den Hondel (1988)
Aspergillus oryzae                    +       +       Mattern, Punt & Van den Hondel (1988)
Aspergillus giganteus                 +       ND      Wnendt, Jacobs & Stahl (1990)
Claviceps purpurea                    +       ND      Comino et al. (1989)
Cryphonectria parasitica              +       ND      Churchill et al. (1990)
Curvularia lunata                             ND      Osiewacz & Weber (1989)
Fulvia fulvum                         +       +       Oliver et al. (1987)
Fusarium culmorum                     +       ND      H. Curragh, R. Marchant, H. Mooibroek & J. G. H. Wessels (unpubl.)
Leptosphaeria maculans                +       ND      Farman & Oliver (1988)



predominantly type II transformants are observed when the TRP-1 mar-
ker is used (Binninger et al., 1987). Transformation of Ascobolus immersus
with vector DNA linearised by cutting within the marker sequence or with 
circular single-stranded vector DNA preferentially results in type I inte- 
gration events (Goyon & Faugeron, 1989). 

In the case of heterologous selectable markers integration will always 
occur through non-homologous recombination, seemingly at random sites 
in the genome. 

Genetic manipulation 

The availability of different gene transfer systems with different charac- 
Table 1.4. continued.

                                      Vector
Transformed species                   pAN7-1  pAN8-1  Reference

Neurospora crassa                     +       ND      Staben et al. (1989)
Penicillium chrysogenum                       +       Kolar et al. (1988)
Penicillium roquefortii               +       ND      N. Durand, P. Reymond & M. Fevre (unpubl.)
Pseudocercosporella herpotrichoides   +       ND      Blakemore et al. (1989)
Schizophyllum commune                 +       ND      Mooibroek et al. (1990)
Septoria nodorum                      +       ND      Cooley et al. (1988)
Talaromyces emersonii                 ND      +       S. Jain, H. Durand & G. Tiraby (unpubl.)
Trichoderma harzianum                 +       ND      Goldman, Van Montagu & Herrera-Estrella (1990);
                                                      C. J. Ulhoa, M. H. Vainstein & J. F. Peberdy (unpubl.)
Trichoderma hamatum                   +       ND      C. J. Ulhoa, M. H. Vainstein & J. F. Peberdy (unpubl.)
Trichoderma viride                    +       ND      Herrera-Estrella, Goldman & Van Montagu (1990)
Trichosporon cutaneum                 +       +       Glumoff et al. (1989)



interesting processes by isolation, characterisation and functional analysis 
of the genes and gene products involved. To perform these studies, 
specific vectors are constructed which facilitate genetic manipulation 
such as cloning of a gene by complementation of a mutation, gene disrup- 
tion or gene replacement, and analysis of expression signals in vivo. 

To illustrate the possibilities of genetic manipulation for molecular 
genetic studies examples will be given of research on Aspergillus that is in 
progress in our laboratory. The first example concerns experiments that 
have been performed to prove that a gene encoding a functional benzoate-
p-hydroxylase of Aspergillus niger was cloned. In the second example,
Fig. 1.2. Schematic representation of plasmid pAB8-8, which contains the
disrupted A. niger bphA gene and the plasmids pAB8-24 and pAB8-25, which
contain the bphA gene and, respectively, the wildtype or a mutant allele of the
pyrG gene (Van Gorcom & Van den Hondel, 1988) of A. niger. The
disrupted bphA gene was obtained by replacing an EcoRV segment, located
within the bphA gene, with the phleomycin resistance unit of pAN8-1. Thick line
represents A. niger DNA, punctuated line Streptoalloteichus hindustanus DNA
and thin line E. coli DNA; ble, phleomycin resistance gene; Ap^r, ampicillin
resistance gene; 'bphA', 5'- or 3'-terminal part of the bphA gene. Arrows indicate
the direction of transcription.

analysis of the promoter region of the Aspergillus genes niaD and
niiA will be described. The third example deals with a study of the
influence of different signal sequences on the efficiency of production of
prochymosin in A. niger.

Cloning of a functional bphA gene of A. niger

Benzoate is metabolized by A. niger in a series of steps of which the first
is p-hydroxylation of the aromatic ring of benzoate, carried out by
benzoate-p-hydroxylase (BPH). Several mutants, disturbed in BPH activ-
ity, have been isolated (Boschloo & Bos, in preparation). These mutations
were shown to belong to one complementation group; therefore the
mutation was named bphA.

A cosmid clone, pAB8-1, containing the putative bphA gene, was
isolated by differential hybridization techniques. The gene was localized
on a 6.2 kb EcoRI-PvuII fragment, which was subcloned in pUC19,
resulting in pAB8-22 (Van Gorcom et al., 1990). Introduction of pAB8-1
or pAB8-22 DNA into an A. niger bphA mutant resulted in the restoration
Fig. 1.3. Strategies to disrupt the bphA gene. Thin lines represent plasmid DNA
and thick line chromosomal DNA. Shaded boxes represent the bphA gene or
part of the gene. Hatched boxes represent the phleomycin resistance unit. Two
restriction sites within the bphA gene are indicated with X and Y. (A) Disruption
of the bphA gene by transformation with a plasmid which contains an internal
restriction fragment. The recombination event shown results in formation of a
duplication of bphA with the leftward copy lacking the 3' end of the gene and
the rightward copy lacking the 5' end. (B) Disruption of the bphA gene by
transformation with a linear fragment which contains a mutant allele of the bphA
gene obtained by replacing an internal fragment with the phleomycin resistance
unit. Recombination between the rightward and leftward homologous regions
of the DNA fragment and the corresponding chromosomal regions results in a
gene replacement of the wildtype gene with the mutant (disrupted) bphA allele.



of the ability to grow on benzoate, suggesting that the DNA fragment 
contained the bphA gene. 

Although remote, it cannot be completely excluded that a suppressor 
of the bphA mutation had been cloned. One approach to exclude this 
possibility is to disrupt the cloned gene, replace the chromosomal gene 
by the disrupted equivalent and test for the inability to grow on benzoate. 

Two methods regularly used in gene-disruption experiments are indi- 
cated in Fig. 1.3. In both cases the disruption vector contains a 
non-functional copy of the chromosomal gene to be disrupted. The 
method indicated in Fig. 1.3A requires knowledge about the exact posi- 
tion of the gene in the cloned fragment, whereas for the method indicated 

A. niger strain in which the
bphA gene was disrupted, the method indicated in Fig. 1.3B was chosen.
For the disruption experiment, plasmid pAB8-8 was constructed (Fig. 1.2) 
which contains the non-functional bphA gene. In this plasmid part of the 
bphA sequences has been replaced by the phleomycin resistance unit of 
pAN8-1 (Mattern, Punt & Van den Hondel, 1988). Transformation of A.
niger wild type with the isolated EcoRI fragment of pAB8-8 resulted in a
number of phleomycin resistant colonies. Southern blot analysis revealed 
that in about 10% of the transformants a gene replacement had occurred. 
Further analysis showed that these transformants were not able to grow 
on benzoate as carbon source. This result confirms that the bphA gene 
and not a suppressor gene had been cloned. 

Further evidence for cloning of the benzoate-p-hydroxylase-encoding 
gene was obtained from the DNA sequence of the bphA gene. Sequence 
comparison showed that the bphA gene encoded a cytochrome P450 
mono-oxygenase, as might be expected. 

Another important issue was the question whether the cloned gene was 
a functional copy of the bphA gene. To answer this question it was 
necessary to prove that the bphA mutation was complemented by the 
product of the cloned gene. Therefore an A. niger bph⁻ strain was 
transformed with a plasmid containing the cloned gene and transformants were 
isolated in which the plasmid was integrated at an ectopic locus. Growth 
of these transformants on benzoate would indicate that a functional gene 
had been cloned. To achieve ectopic integration, the A. niger pyrG 
selection marker was cloned into pAB8-22 resulting in plasmid pAB8-24 (Fig. 
1.2). Van Hartingsveldt et al. (1987) previously had found that a vector 
containing this selection marker is integrated at the pyrG locus in about 
50% of A. niger transformants. However, Southern analysis of 48 
transformants, obtained with pAB8-24, revealed that none of these transformants 
contained a vector integrated at the pyrG locus. Further analysis indicated 
that in most transformants the vector was integrated at the bphA locus. 

To overcome the problem of preferential integration at the bphA locus, 
a mutant allele of the A. niger pyrG gene (Van Gorcom & Van den Hondel, 
1988) was cloned in pAB8-22, resulting in pAB8-25 (Fig. 1.2). This mutant 
allele was constructed by introduction of a frameshift mutation which 
inactivates the marker gene. Transformation with the mutant allele as 
selection marker can result in Pyr⁺ transformants only through type I or 
type III integration events. Analysis by Southern blotting of transformants 
obtained with pAB8-25 revealed that 14 out of 32 contained a single copy 
of this plasmid integrated at the pyrG locus. These transformants also 
showed a restored ability to grow on benzoate, indicating that, indeed, a 
functional bphA gene had been cloned. As demonstrated by Southern 
analysis the other transformants resulted from a gene replacement at the 
pyrG locus. As expected, these transformants could not grow on benzoate. 

Vectors for analysis of expression signals from Aspergillus genes 

In both fundamental and applied molecular biological research on 
filamentous fungi the unravelling of the mechanism of gene expression is 
a very important topic. Interesting biological processes, such as develop- 
ment, differentiation and carbon and nitrogen metabolism are regulated 
at the level of gene expression. A wealth of classical genetic information 
is available for these processes, but, until recently, hardly any molecular 
genetic research was carried out. To provide an easy way to assay the 
expression and regulation of various genes, we developed reporter vectors 
for filamentous fungi (Van Gorcom et al., 1986; Van Gorcom & Van den 
Hondel, 1988; Roberts et al., 1989). In these vectors the analysis of fungal 
expression signals can be carried out by fusion of these signals to the E. 
coli reporter genes lacZ or uidA, encoding β-galactosidase and 
β-glucuronidase, respectively. The products of these genes can be assayed both 
qualitatively and quantitatively with easy and sensitive methods. For 
proper analysis of expression signals, it is essential that integration of one 
copy of the expression unit can be achieved at a specific location on the 
chromosome of the recipient. To fulfil this requirement, homologous 
selection markers were introduced in these vectors. An even higher 
(relative) frequency of homologous integration could be obtained by using 
mutant selection markers. These mutant selection markers were con- 
structed by introduction of a frameshift mutation which inactivates the 
marker gene. Thus, only intragenic recombination (Type I or III integra- 
tion) between the mutant selection marker on the vector and the mutant 
allele in the genome will result in prototrophic transformants. Although 
the transformation frequency obtained with this type of marker is much 
reduced (about 10-100 fold), Southern analysis of only a few 
transformants is sufficient to identify transformants with a single copy at the locus 
chosen (Table 1.5). Also, linearisation of the vector with a restriction 
enzyme which cuts in the marker gene, increases the relative frequency of 
Type I integration (Table 1.5). 
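
The counts reported in Table 1.5 make this effect concrete. The following Python sketch is not part of the original work; it only tabulates, for each vector, the fraction of analysed LacZ⁺ transformants that falls into each integration category. The dictionary keys and the function name are illustrative assumptions.

```python
from fractions import Fraction

# Counts of analysed LacZ+ transformants per integration category (Table 1.5).
# A: single copy at the argB locus; B: multiple (tandem) copies at argB;
# C: ectopic integration. The row labels are paraphrased for readability.
TABLE_1_5 = {
    "pAN5-d1 (circular)": {"A": 0, "B": 4, "C": 15},
    "pAN5-d1 (BglII digest)": {"A": 1, "B": 5, "C": 4},
    "pAN5-d1BglII (mutant argB)": {"A": 5, "B": 3, "C": 2},
}


def category_fractions(counts):
    """Return the exact fraction of analysed transformants in each category."""
    total = sum(counts.values())
    return {category: Fraction(n, total) for category, n in counts.items()}


for vector, counts in TABLE_1_5.items():
    summary = ", ".join(
        f"{category}: {float(fraction):.0%}"
        for category, fraction in category_fractions(counts).items()
    )
    print(f"{vector}: {summary}")
```

With these counts, single-copy integration at argB (category A) rises from none of 19 transformants for the circular wildtype-marker vector to half of the transformants analysed for the mutant-marker vector.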

The promoters of the gpdA genes of both A. niger and A. nidulans were 
studied in A. niger with the use of one of these vectors (Fig. 1.4). Single 
copy transformants, obtained with the two pgpdA-lacZ fusion constructs, 
were assayed for β-galactosidase activity. In both cases efficient 
β-galactosidase expression was obtained (Table 1.6), whereas in untransformed 
strains or strains transformed with pAB94-12 (vector without promoter 
sequences inserted) no significant β-galactosidase activity was detected. 
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Table 1.5. Results of Southern analysis of A. nidulans transformants 
obtained with pAN5-d1 and derivatives 

Vectorᵃ                   Transformation   Percentage of LacZ⁺   Type of integrationᵈ 
                          frequencyᵇ       transformantsᶜ        A       B       C 

pAN5-d1                   20-40            60%                   0/19    4/19    15/19 
pAN5-d1 (BglII digest)    40-100           90%                   1/10    5/10    4/10 
pAN5-d1BglII              0.1-1            40%                   5/10    3/10    2/10 

ᵃ Vector pAN5-d1 contains a pgpdA-lacZ fusion and the wildtype argB gene as 
selection marker for Aspergillus transformation (Punt et al., 1990). The vector 
contains a unique BglII site in the coding region of the argB gene. Analysis of 
the transformants obtained with pAN5-d1, with a BglII digest of pAN5-d1 and 
with pAN5-d1BglII, in which the unique BglII site was filled in with PolIK resulting 
in a frameshift mutation in the argB gene (Punt et al., 1990), was carried out. 
Vectors were introduced into A. nidulans ArgB⁻ (methG2, biA1, argB2). 
ᵇ Transformation frequency is given as transformants per μg of vector DNA. 
ᶜ The percentage of LacZ⁺ transformants was determined by plating 
transformants on agar plates containing XGal (van Gorcom et al., 1985). In all 
cases both LacZ⁺ and LacZ⁻ transformants were observed. The LacZ⁻ 
transformants probably arose from gene replacement events. 
ᵈ Southern analysis of a number of LacZ⁺ transformants was carried out. The 
transformants were classified in three categories: A, single copy integration of 
the vector at the argB locus; B, multiple copy (tandem) integration of vector 
molecules at the argB locus; C, ectopic integration, in some cases in 
combination with single or multiple copy homologous integration. 



A. nidulans and A. niger are both very efficient in A. niger. Further analysis 
of the organisation of the expression signals of the A. nidulans gene gpdA 
with similar vectors developed for A. nidulans is in progress in our 
laboratory (Punt et al., 1990). 

Recent research has shown that many fungal genes involved in devel- 
opmental and metabolic pathways are organised as gene clusters (Gurr, 
Unkles & Kinghorn, 1988). Frequently, these clustered genes are co- 
ordinately expressed from divergently transcribing intergenic promoter 
regions. For the analysis of such intergenic regions a twin reporter vector 
was developed (Fig. 1.5). The usefulness of this vector can be inferred 
from the functional analysis of the intergenic region between the A. 
nidulans nitrate reductase (niaD) and nitrite reductase (niiA) genes. As 
shown in Table 1.7, both nitrate induction and nitrogen metabolite 
(ammonium) repression are observed for the reporter genes. Thus, the 
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Fig. 1.4. Schematic representation of expression analysis vectors pAB94-11 to 
13 for A. niger (Van Gorcom & Van den Hondel, 1988). The different vectors 
contain a unique BamHI site in one of the three reading frames in front of the 
lacZ' gene (the protein coding region of the E. coli lacZ gene lacking the first 
eight codons). Thick line represents A. niger DNA (XbaI fragment) and A. 
nidulans DNA. Thin line represents E. coli DNA; TtrpC, terminator region of the 
trpC gene; ApR, ampicillin resistance gene; pyrG, mutant allele of the A. niger 
pyrG gene. Arrows indicate the direction of transcription. 



Table 1.6. β-Galactosidase expression in A. niger transformants 
containing pgpdA-lacZ fusion genes 

Strainᵃ                   PgpdA          βGAL activityᵇ 

AB4.1[pAB94-53]4          A. niger       8570 
                6                        8380 
                7                        7770 
AB4.1[pAB94-121]4         A. nidulans    5160 
                 13                      5480 
                 17                      5350 
AB4.1                     -              <10 

ᵃ Vectors pAB94-53 and pAB94-121, derivatives of pAB94-11/12/13, containing 
the promoter region of the gpdA gene of A. niger and A. nidulans, respectively, 
fused to the lacZ gene, were introduced into A. niger AB4.1 (cspA1, pyrG). 
Transformants with a single copy of the vector integrated at the pyrG locus were 
identified by Southern analysis. 

ᵇ Enzyme activity is given in units (mg protein)⁻¹ and was measured as described 
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Fig. 1.5. Schematic representation of the twin reporter vector pTRAN2. Thin line 
represents pBR322 DNA; thick lines, A. nidulans DNA and E. coli DNA (EcoRI 
fragment) which contains the coding region of the E. coli genes lacZ and uidA 
both without translation initiation codon. A unique NotI site is placed between 
these genes; TniaD, terminator region of the niaD gene; TniiA, terminator region of 
the niiA gene; ApR, ampicillin resistance gene; argB*, mutant allele of the A. 
nidulans argB gene containing a frameshift mutation. Arrows indicate the 
direction of transcription. 



expression of the reporter genes lacZ and uidA faithfully represents the 
regulated gene expression of the genes niaD and niiA (Cove, 1979). 

Expression of prochymosin 
Several filamentous fungi are able to produce large amounts of extra- 
cellular proteins. Due to this property, several groups, including ours, are 
carrying out research to evaluate the potential of these strains for the 
production of heterologous proteins. One of the questions we addressed 
in our research on expression and secretion of heterologous, extracellular 
proteins in A. niger is the influence of different signal sequences on the 
efficiency of protein production/secretion. To answer this question 
experiments were performed to analyze the production of prochymosin with 
four different gene fusions (van Hartingsveldt et al., 1990). These fusions 
were placed under the control of the expression signals of the A. niger 
glucoamylase (glaA) gene. To facilitate proper comparison, 
transformants containing a single copy of the expression unit integrated at the glaA 
locus were isolated. For this purpose four different prochymosin 
expression vectors, pAB64-72 to pAB64-75 (Fig. 1.6), were used. Transformation 
of A. niger with HindIII-linearised pAB64-72 to 75 resulted in a number 
of hygromycin B resistant transformants. Southern analysis demonstrated 
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Table 1.7. Expression of the reporter genes in pTRAN2-1A transformants 

                        Relative enzyme activitiesᵇ 

Strainᵃ:                G324              SAA1012 
Nitrogen source         βGUS    βGAL      βGUS    βGAL 

proline                 10      20        320     180 
nitrate + proline       100     100       490     260 
nitrate + ammonium      20      40        10      20 
ammonium                2       10        2       4 

ᵃ Vector pTRAN2-1A, a derivative of pTRAN2 (Fig. 1.5), containing the A. 
nidulans niaD-niiA intergenic promoter region was introduced in A. nidulans 
strains G324 (wA3, yA2, methH2, argB2, galA1, sC12, ivoA1) and SAA1012 
(fwA1, yA2, methH2, pabaA1, argB2, niiA-niaDΔ509). Single copy transformants 
were identified by Southern analysis. In all cases the β-glucuronidase (βGUS) 
expression is a result of the activity of the expression signals of the niaD gene, 
and the β-galactosidase (βGAL) expression results from niiA gene expression 
signals. 

ᵇ Mycelial extracts were prepared from cells cultivated for 16 to 18 h in minimal 
growth medium with appropriate supplements and 10 mM of the indicated 
nitrogen sources. The enzyme activities were determined as described 
previously (Van Gorcom et al., 1985; Roberts et al., 1989) and are expressed 
relative to the activities of the G324[pTRAN2-1A] transformants induced with 
nitrate (= 100). In a representative experiment specific activities of 80 nmol 
p-nitrophenol min⁻¹ (mg protein)⁻¹ for βGUS and 310 nmol o-nitrophenol min⁻¹ 
(mg protein)⁻¹ for βGAL were found for the G324[pTRAN2-1A] transformants 
induced with nitrate. 
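
The relative scale used in Table 1.7 can be related back to specific activities with simple arithmetic. The Python sketch below is an illustration only: it assumes that the specific activities quoted for the representative experiment (80 and 310 nmol min⁻¹ (mg protein)⁻¹) can serve as the reference for the value 100, which need not hold for the data actually tabulated. The function names are hypothetical.

```python
# Specific activities of the nitrate-induced G324[pTRAN2-1A] transformants in the
# representative experiment quoted in the footnote. They define 100 on the
# relative scale. Units: nmol nitrophenol per min per mg protein.
REFERENCE_ACTIVITY = {
    "GUS": 80.0,   # beta-glucuronidase, p-nitrophenol released
    "GAL": 310.0,  # beta-galactosidase, o-nitrophenol released
}


def relative_to_specific(relative, enzyme):
    """Convert a relative activity (reference = 100) into a specific activity."""
    return relative * REFERENCE_ACTIVITY[enzyme] / 100.0


def specific_to_relative(specific, enzyme):
    """Express a specific activity relative to the nitrate-induced reference."""
    return 100.0 * specific / REFERENCE_ACTIVITY[enzyme]


# Example: the relative GUS value of 320 listed for SAA1012 grown on proline.
print(relative_to_specific(320, "GUS"))
```

Under this assumption, a relative value of 320 for βGUS corresponds to 256 nmol p-nitrophenol min⁻¹ (mg protein)⁻¹.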

Table 1.8. Analysis of prochymosin production in A. niger 

Strainᵃ    Signal peptide                        Westernᵇ      MCAᵇ 
                                                 (μg ml⁻¹)     (U ml⁻¹) 

AB64-72    signal sequence of prochymosin        6.2           8.16 
AB64-73    signal sequence of glaA               11.3          19.5 
AB64-74    signal sequence of glaA               4.1           3.2 
           + 6 additional amino acids 
AB64-75    signal sequence of glaA               10.2          19.1 
           + 53 additional amino acids 

ᵃ Vectors pAB64-72 to 75, linearised by cutting with HindIII, were introduced into 
A. niger. Transformants in which the glaA gene is replaced by the prochymosin 
fusion genes were identified by Southern hybridization (Van Hartingsveldt et 
al., 1990). 

ᵇ Medium samples from cells cultivated for 24 h in induction medium were 
analyzed for the presence of prochymosin by Western blotting (Western) and 
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that in about 10% of these transformants the resident glaA gene was 
replaced by the expression/secretion unit. Similar results were obtained 
with circular vector DNA, though with a three- to five-fold lower 
frequency. 

Transformants which contained one copy of the expression/secretion 
unit were analyzed for prochymosin production 24 h after induction of the 
glaA promoter with starch (Van Hartingsveldt et al., 1990). As shown in 
Table 1.8 similar levels of prochymosin were produced with pAB64-73 (18 
amino acids of glaA) and pAB64-75 (71 amino acids of glaA). With 24 
amino acids of glaA in front of prochymosin, or with the signal sequence 
of prochymosin itself, a lower production level was observed. 

Although the reasons for the observed differences are obscure, our 
results clearly demonstrate that gene fusions containing different 5' se- 
quences influence the production level of prochymosin. 

Conclusions 

During the last few years the development of gene transfer systems has 
been described for more than fifty fungal species. Transformation of most 
species could be achieved with heterologous auxotrophic markers or 
dominant selectable markers. Usually the marker gene is expressed from 
fungal, mainly A. nidulans, expression signals which were shown to be 
functional in most fungal species. 

A number of strategies are now available for the development of gene 
transfer systems for hitherto poorly characterized fungal species. As 
illustrated in the first part of this chapter, these strategies comprise the 
following aspects. Firstly, methods for the introduction of vector DNA 
(Table 1.1). Secondly, a large number of auxotrophic and dominant 
selectable markers (Tables 1.2 to 1.4). Thirdly, efficient strategies for the 
isolation of auxotrophic mutant strains. 

The main purpose for the development of gene transfer systems is 
application of these systems for molecular genetic studies. In the second 
part of this chapter several applications were illustrated with examples 
taken from research carried out in our laboratory. Genetic manipulation 
experiments were carried out (a) to disrupt the bphA gene of A. niger, (b) 
to analyze expression signals in A. nidulans and A. niger, (c) to direct 
expression-analysis vectors at specific sites of the genome such as the argB 
locus of A. nidulans or the pyrG locus of A. niger and (d) to perform gene 
replacement experiments in which the glaA gene of A. niger was replaced 
by chimeric prochymosin genes. These examples, as well as others 
described in the recent literature, indicate that most strategies and tools for 
genetic manipulation in filamentous fungi are now available, especially for 
A. nidulans, A. niger and Neurospora crassa. Extensive molecular genetic 
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studies of many interesting biological processes occurring in filamentous 
fungi can now be carried out using these strategies and tools. 
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I. INTRODUCTION 

The purpose of this section is to provide a brief explanation of fungal tax- 
onomy and a reference table for identifying major taxa. Remember that tax- 
onomic schemes are neither static nor universally accepted. The one presented 
below follows Ainsworth (1971) and Ainsworth et al. (1973a,b). Other 
authorities may present quite different hierarchies and headings. Nomenclatural 
convention for fungi demands that subdivisions end in "-mycotina," classes in 
"-mycetes," orders in "-ales," and families in "-aceae." Depending on the 
authority and the scheme adopted, you may find the same group accorded differ- 
ential rank. For example, the ascus-producing fungi may be viewed as a class, 
Ascomycetes, or as a subdivision, Ascomycotina. 
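
The suffix convention just described lends itself to a mechanical check. The following Python sketch is an illustration, not an established tool: the function name is hypothetical, and it only knows the four endings listed above, so any other name, including genus names, yields no rank. A name that merely happens to end in one of these suffixes would be misread.

```python
# Suffix-to-rank pairs taken from the naming convention described in the text.
RANK_SUFFIXES = (
    ("mycotina", "subdivision"),
    ("mycetes", "class"),
    ("ales", "order"),
    ("aceae", "family"),
)


def rank_from_name(taxon):
    """Return the rank implied by the ending of a taxon name, or None."""
    name = taxon.strip().lower()
    for suffix, rank in RANK_SUFFIXES:
        if name.endswith(suffix):
            return rank
    return None


# The same group of ascus-producing fungi at two different ranks.
for name in ("Ascomycetes", "Ascomycotina"):
    print(name, "->", rank_from_name(name))
```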

If you are interested in exposure to other taxonomic arrangements and in 
learning more about mycology in general, consult one of the comprehensive, 
recent mycology texts such as Burnett (1968), Alexopoulos and Mims (1979), 
Ross (1979), or Moore-Landecker (1982). 
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The kingdom Fungi is divided here into two divisions. The Myxomycota, 
commonly called the "slime molds," are a varied group of organisms having a 
plasmodium at some point in their life cycle. One contemporary mycologist 
pointed out that "the very words slime mold reflect the confusion that has 
surrounded this group of organisms, because they are certainly not molds and 
they are not particularly slimy" (Ross, 1979, p. 178). A number of taxonomic 
questions remain unanswered as to whether the members of the Myxomycota 
really belong with the fungi. 

Members of the division Eumycota, commonly called the "true fungi," 
usually have a filamentous or yeastlike form, and no plasmodium. Our scheme 
divides the group into five subdivisions. The Mastigomycotina and 
Zygomycotina constitute the "lower fungi"; the Ascomycotina, Basidiomycotina, and 
Deuteromycotina constitute the "higher fungi." 

The lower fungi are distinguished by hyphae without cross-walls (nonseptate) 
and by the formation of asexual spores through cleavage of cytoplasm within 
sporangia; they include several groups that possess flagellated zoospores. For 
many years, the lower fungi were grouped together in a single class, the 
Phycomycetes. Phycomycete means "algal fungus" and the name stems from the 
theory that these fungi were degenerate algae that had lost their chlorophyll. 
The term phycomycete no longer has official taxonomic status, but is still encountered in older 
texts and in works by authors who have not kept up with trends in fungal 
systematics. The classification we present here puts the lower fungi into two 
subdivisions, both of which encompass a diverse and composite group of 
organisms. 

The subdivision Mastigomycotina includes species often identified with ani- 
mals because of the defining characteristic of the group, motile spores. Many of 
these organisms are called water molds because of the prevalent aquatic growth 
habit. 

Zygomycotina contains nonseptate fungi which lack a motile stage and are 
only rarely aquatic. Members of this subdivision exhibit gametangial fusion and 
zygospore formation. 

Taxonomically, the higher fungi are easier to delineate. With the exception of 
the yeasts, they have septate hyphae and often produce elaborate fruiting bodies. 
They are divided here into three subdivisions: the Ascomycotina, the Basidio- 
mycotina, and the Deuteromycotina. The Ascomycotina and Basidiomycotina 
are distinguished by their sexual spores; the Deuteromycotina reproduce entirely 
by asexual means. 

The Ascomycotina form ascospores inside a specialized reproductive structure 
called an ascus. Two haploid nuclei fuse within the immature ascus and then the 
diploid fusion nucleus immediately undergoes meiosis, resulting in four haploid 
spores. One mitotic division usually ensues so that most members of the As- 
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comycotina have eight-spored asci. The retention of the products of meiosis 
within a single morphological structure has facilitated many elegant studies on 
chromosomal mechanisms of crossing-over. The three premier species for fungal 
genetics, Aspergillus nidulans, Neurospora crassa, and Saccharomyces 
cerevisiae, are all members of this group. Special features of fungal genetic analysis 
are discussed in detail by Esser and Kuenen (1967), Burnett (1975), and 
Fincham et al. (1979). 

The Basidiomycotina form sexual basidiospores on a basidium. Basidiospore 
formation closely resembles ascospore development, except that the spores are 
borne externally. Fusion of haploid nuclei results in a transient diploid that 
immediately undergoes meiosis to form four haploid basidiospores. An unusual 
cytological feature of the basidiomycete life cycle is the formation of a special 
binucleate cell called the dikaryon. This subdivision contains the majority of 
conspicuous, macroscopic fungi such as mushrooms, puffballs, and shelf fungi. 
It also contains the important plant pathogens known collectively as rusts and 
smuts. 

The Deuteromycotina, or Fungi Imperfecti, are distinguished by the absence 
of any known sexual form. They reproduce largely by asexual conidiospores. 
Taxonomists consider this an "artificial" group, and often highlight this ar- 
tificiality by using the prefix "form" with reference to the taxa within this 
subdivision (e.g., form-class, form-family, form-genus, form-species). Many 
species originally classified as imperfects are eventually shown to possess a 
sexual stage, usually within the Ascomycotina or, more rarely, within the 
Basidiomycotina. The sexual phase, also called the perfect stage or teleomorph, 
is given a separate name. The rules of botanical nomenclature specify that sexual 
names should have precedence over the asexual (also called imperfect or 
anamorphic) names. This creates both practical and philosophical problems. The 
genus names Aspergillus, Penicillium, and Fusarium are all imperfect epithets. 
According to the internationally adopted rules of nomenclature, any time a 
sexual stage is found for a member of one of these genera, the name of that 
species should be changed to that of the sexual form. For example, according to 
these rules, Aspergillus nidulans should be called Emericella nidulans. In prac- 
tice, despite the fact that this species regularly forms ascospores, virtually every- 
one still calls it Aspergillus nidulans. 

Many economically important fungi are classified in the Deuteromycotina. For 
more details about the taxonomy of Aspergillus see Raper and Fennell (1965); for 
Fusarium see Nelson et al. (1981); and for Penicillium see Pitt (1979) and 
Ramirez (1982). The majority of important human pathogens also belong to this 
group; see Rippon (1982). Finally, for a discussion of the issues and problems 
surrounding nomenclatural conventions in the Fungi Imperfecti, see Bennett 
(1985). 
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II. OUTLINE OF FUNGAL TAXONOMY 

Kingdom: Fungi 

Division: Myxomycota (plasmodium or pseudoplasmodium present) 
    Class: Acrasiomycetes ("cellular slime molds") 
        Example: Dictyostelium 
    Class: Myxomycetes ("acellular slime molds") 
        Example: Physarum 
Division: Eumycota (assimilative phase typically filamentous or yeastlike) 
    Subdivision: Mastigomycotina (nonseptate mycelium, motile spores) 
        Examples: Achlya, Allomyces, Blastocladiella, Pythium, 
        Phytophthora, Saprolegnia 
    Subdivision: Zygomycotina (nonseptate mycelium, zygospores) 
        Examples: Absidia, Blakeslea, Mortierella, Mucor, 
        Pilobolus, Rhizopus 
    Subdivision: Ascomycotina ("sac fungi"; septate mycelium or yeast; 
    sexual spores borne in an ascus) 
        Examples: Saccharomyces, Saccharomycopsis (Yarrowia), 
        Schizosaccharomyces; Neurospora, Podospora, Sordaria; the 
        sexual stages of both Aspergillus and Penicillium; Ascobolus; 
        truffles and morels 
    Subdivision: Basidiomycotina ("club fungi"; septate mycelium or yeast; 
    sexual spores borne exogenously on a basidium) 
        Examples: Puccinia, Ustilago, jelly fungi, rusts, smuts; 
        Agaricus, Coprinus, Schizophyllum, mushrooms, puffballs, 
        shelf fungi 
    Subdivision: Deuteromycotina (the Fungi Imperfecti; septate mycelium or 
    yeast; no known sexual phase) 
        Examples: Aspergillus, Fusarium, Penicillium; Candida, 
        Histoplasma, Wangiella 
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Recommendations for uniform conventions of genetic nomenclature have been 
published for bacteria (Demerec et al., 1966), Aspergillus nidulans (Clutterbuck, 
1973), Saccharomyces cerevisiae (Sherman, 1981), and Neurospora crassa 
(Perkins et al., 1982). In this volume, we make no attempt to impose a uniform 
standard of genetic symbols, but rather allow our authors to utilize the conven- 
tions of their particular organism and laboratory. 

Although the designations for gene symbols and phenotypes are not the same 
for bacteria, yeasts, and molds, enough similarity exists to mislead the unwary 
reader. Since the publication of the proposals for bacterial genetics by Demerec 
et al. (1966), most primary gene symbols have been designated by three-letter, 
italicized symbols (e.g., arg for a locus affecting arginine biosynthesis). Some 
Neurospora and Aspergillus symbols predate the proposals for standardization of 
genetic nomenclature in bacteria and have fewer or more than three letters. 

The conventions for distinguishing different loci that produce the same phe- 
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notypic change show minor, but confusing, variation from system to system. In 
bacteria and A. nidulans an italicized capital letter immediately follows the 
three-letter symbol (argA, argB, etc.), while in yeast nonhyphenated numbers are used 
(arg1, arg2, etc.). In N. crassa, hyphenated numbers are used to distinguish loci 
(arg-1, arg-2, etc.). In yeast, hyphenated numbers designate alleles (arg1-37); in 
A. nidulans, unhyphenated numbers designate alleles (argA2) and hyphenated 
numbers designate unmapped mutants (arg-51). 
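
How easily these conventions can be confused becomes clear once they are written down as patterns. The Python sketch below is a simplified illustration of the rules in the preceding paragraph only; the pattern set, the system labels and the function name are assumptions made for this example, and superscripts, capitalised dominant symbols and parenthesised allele names are deliberately ignored.

```python
import re

# One pattern per system, following the locus and allele rules summarised above.
PATTERNS = {
    # Capital letter = locus, trailing digits = allele, "-digits" = unmapped mutant.
    "A. nidulans": re.compile(
        r"^(?P<symbol>[a-z]{1,5})"
        r"(?:(?P<locus>[A-Z])(?P<allele>\d+)?|-(?P<unmapped>\d+))$"
    ),
    # Unhyphenated number = locus, hyphenated number = allele.
    "yeast": re.compile(r"^(?P<symbol>[a-z]{3})(?P<locus>\d+)(?:-(?P<allele>\d+))?$"),
    # Hyphenated number = locus.
    "N. crassa": re.compile(r"^(?P<symbol>[a-z]{1,4})-(?P<locus>\d+)$"),
}


def parse_symbol(text, system):
    """Split a gene symbol into its parts under one system, or return None."""
    match = PATTERNS[system].match(text)
    if match is None:
        return None
    return {key: value for key, value in match.groupdict().items() if value is not None}


# The same string is read differently depending on the organism.
for system in PATTERNS:
    print(system, parse_symbol("arg-51", system))
```

The string arg-51 is parsed as an unmapped mutant under the A. nidulans rules, as locus 51 under the N. crassa rules, and is rejected under the yeast rules, which is the kind of ambiguity the text warns about.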

The conventions for phenotype, dominance, mating-type loci, designation of 
wild type, and other genetic symbols also show subtle differences between the 
systems. The most important recommendations from Clutterbuck (1973), Sher- 
man (1981), and Perkins et al. (1982) are summarized in Sections I-III. Some 
representative examples are given to illustrate each system. Section IV cites a 
few additional systems of fungal genetic nomenclature. See the references for 
more complete explanations of all these nomenclatural conventions. 



I. ASPERGILLUS NIDULANS 

The recommendations for the nomenclature and conventions used for A. 
nidulans follow those of bacterial genetics and are published in Clutterbuck 
(1973). All genetic loci and mutants introduced subsequent to this publication are 
designated by three-letter symbols in italics (e.g., arg). Older symbols, pre- 
viously adopted in the literature, are retained and consist of one to five italic 
letters (e.g., y = yellow; panto = pantothenic acid requirement). Nonallelic loci 
that have the same primary symbols are distinguished by an italic capital letter 
following the symbol, e.g., argA, argB. Alleles are distinguished by italic serial 
mutant numbers after the symbol and locus letter, e.g., argA1, argA2. Where the 
allelic relationships of a mutant have not yet been determined, the capital letter is 
replaced by a hyphenated number (e.g., arg-51). Mitochondrial gene symbols 
are enclosed in square brackets, e.g., [oliA1]. 

Wild-type alleles are indicated by a superscript "plus," e.g., argA⁺. 
Occasionally, dominant mutants are designated by capitalizing the first of the three 
letters in a symbol (Acr for acriflavine resistance). In general, dominance is not 
indicated in the primary gene symbol. Symbols for phenotypes are distinguished 
from symbols for genes. Often the phenotype is simply written out in 
unabbreviated fashion (e.g., "arginine requirement"); alternatively, a nonitalic version of 
the gene symbol with the first letter capitalized is used, e.g., Arg⁻. 

Suppressors used to be designated by complex symbols including the locus 
and/or allele suppressed, e.g., suA1adE20, but now simple symbols are 
encouraged, e.g., suaA1 for an allele-specific, locus-nonspecific suppressor. It is important to 
note that the wild-type, nonsuppressing allele is designated with a symbol 
"plus," as in suaA⁺, opposite to the usage for bacteria. 
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Superscripts are used to indicate mutants with specific properties; for instance, 
areA^lS is an areA allele giving derepressed phenotypes for ammonium-re- 
pressed genes, while areA^l gives correspondingly repressed phenotypes. 

The following examples illustrate the conventions used in the genetic nomen- 
clature for A. nidulans: 

argA      A specific locus or mutation that produces a requirement 
          for arginine as the phenotype 
argA⁺     The wild-type allele 
argA2     A specific allele or mutation in the argA gene 
arg-51    An arginine-requiring mutant not yet tested for allelism, 
          whose locus is unknown 
Arg⁺      A strain not requiring arginine 
Arg⁻      A strain requiring arginine 

A list of A. nidulans loci is given in Clutterbuck (1974), genetic maps are 
given in Clutterbuck (1984), and the mitochondrial genome is summarized by 
Spooner and Turner (1984). 



II. NEUROSPORA CRASSA 

A summary of conventions, gene symbols, and map locations of N. crassa 
genes is presented in Perkins et al. (1982), following Barratt and Perkins (1965). 
These conventions antedate bacterial genetic nomenclature and more closely 
follow those of Drosophila. Three-letter gene symbols are used most frequently, 
but symbols of one to four letters are also found. Two-letter symbols are quite 
common (e.g., ad, adenine requirement; qa, quinate utilization). Recessive gene 
symbols are written entirely in lowercase italics. When the mutant allele is 
known to be dominant, the first letter is capitalized (e.g., Sk, Spore killer). 

Symbols without superscripts are used to represent mutant alleles. The same 
symbol with a superscript "plus" designates the wild-type allele, e.g., ad⁺. 
Alleles differing in resistance or sensitivity, or allelic series having no definitive 
wild type, may be distinguished by other superscripts (e.g., cyh-1R, 
cycloheximide resistance; cyh-1S, cycloheximide sensitivity). 

Nonallelic loci are distinguished from one another by numbers, separated from 
the symbol for the locus by a hyphen, e.g., ad-1, ad-2. The use of hyphens to 
distinguish nonallelic gene symbols differs sharply from the conventions for 
bacteria, Aspergillus, and yeast. In Neurospora, the allele number is "not 
usually displayed with the gene symbols, except when necessitated by the use of 
several alleles, when it is included in parentheses after the full locus symbol, e.g. 
pyr-3 (KS43), or when a new mutant gene has not yet been assigned a locus 
number pending tests for allelism with similar genes at previously established 
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loci. In the latter situation, a mutant gene is temporarily designated by an appro- 
priate letter symbol followed immediately by the allele number in parentheses, 
e.g. ilv(STL6)" (Perkins et al., 1982, p. 427). 

Mating-type alleles are called A and a. Suppressors are designated su, followed 
immediately by the symbol of the suppressed gene in parentheses; nonallelic 
suppressors of the same gene are distinguished by hyphenated numbers 
following the parentheses, e.g., su(met-7)-1, su(met-7)-2. Following the Drosophila 
convention, su+ designates the wild type and su designates the mutant 
suppressor allele. 
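
The suppressor convention is regular enough to be checked mechanically. The following Python sketch is an illustration only and is not taken from Perkins et al. (1982); the function name parse_suppressor and the layout of the returned tuple are assumptions. It splits a suppressor symbol into the suppressed gene and the optional hyphenated number.

```python
import re
from typing import Optional, Tuple

# "su", then the suppressed gene in parentheses, then an optional
# hyphenated number that distinguishes nonallelic suppressors.
_SUPPRESSOR = re.compile(
    r"^su\((?P<target>[A-Za-z]{1,4}(?:-\d+)?)\)(?:-(?P<number>\d+))?$"
)


def parse_suppressor(symbol: str) -> Tuple[str, Optional[int]]:
    """Return (suppressed gene, suppressor number or None)."""
    match = _SUPPRESSOR.match(symbol.strip())
    if match is None:
        raise ValueError(f"not a suppressor symbol: {symbol!r}")
    number = match.group("number")
    return match.group("target"), int(number) if number is not None else None


print(parse_suppressor("su(met-7)-1"))
print(parse_suppressor("su(met-7)-2"))
```

The wild-type designation su+ is deliberately left out of the pattern, because plain text cannot show where the superscript falls.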

The following examples illustrate the major conventions used in the genetic 
nomenclature for N. crassa: 

arg            Any locus or mutation that produces a requirement for
               arginine as the phenotype
arg-1          A specific locus that produces a requirement for arginine
arg-1+         The wild-type allele of the arg-1 gene
arg-1 (JWB7)   A specific allele of the arg-1 gene
arg (JWB22)    An arginine-requiring mutant not yet tested for allelism,
               whose locus is unknown
Arg+           A strain not requiring arginine
Arg-           A strain requiring arginine
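
The N. crassa gene-symbol conventions listed above can be sketched as a small parser. This is a minimal Python illustration under stated assumptions: the names NeurosporaSymbol and parse_neurospora are hypothetical, the superscript plus is typed as a trailing "+", and italics cannot be represented, so non-italic phenotype designations (Arg+, Arg-) and the mating-type alleles A and a are outside its scope.

```python
import re
from typing import NamedTuple, Optional


class NeurosporaSymbol(NamedTuple):
    gene: str               # letter symbol, e.g. "arg"
    locus: Optional[int]    # number after the hyphen, None if unassigned
    allele: Optional[str]   # allele number in parentheses, if displayed
    wild_type: bool         # True when the symbol carries a trailing "+"
    dominant: bool          # True when the first letter is capitalized


_GENE = re.compile(
    r"^(?P<gene>[A-Za-z]{1,4})"       # one to four letters
    r"(?:-(?P<locus>\d+))?"           # hyphenated locus number
    r"(?P<plus>\+)?"                  # superscript plus, typed as "+"
    r"(?:\s*\((?P<allele>\w+)\))?$"   # allele number in parentheses
)


def parse_neurospora(symbol: str) -> NeurosporaSymbol:
    """Split a plain-text N. crassa gene symbol into its parts."""
    match = _GENE.match(symbol.strip())
    if match is None:
        raise ValueError(f"unrecognized gene symbol: {symbol!r}")
    gene = match.group("gene")
    locus = match.group("locus")
    return NeurosporaSymbol(
        gene=gene,
        locus=int(locus) if locus is not None else None,
        allele=match.group("allele"),
        wild_type=match.group("plus") is not None,
        dominant=gene[0].isupper(),
    )


for text in ("arg-1", "arg-1+", "arg-1 (JWB7)", "arg (JWB22)", "Sk"):
    print(text, "->", parse_neurospora(text))
```

A symbol with an allele number but no locus number, such as arg (JWB22) or ilv(STL6), comes back with locus set to None, which mirrors the temporary designation used before allelism tests are complete.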

The Perkins et al. (1982) reference includes a detailed compendium of N. 
crassa loci and linkage maps. The maps are updated in Perkins (1984) and the 
mitochondrial genome is summarized by Collins and Lambowitz (1984). 



III. SACCHAROMYCES CEREVISIAE 

The recommendations for the nomenclature and conventions used in yeast 
genetics are summarized by Sherman (1981) and Sherman and Lawrence (1974). 
Gene symbols are consistent with the proposals of Demerec et al. (1966), whenever 
possible, and are designated by three italicized letters, e.g., arg. Contrary to 
the proposals of Demerec et al. (1966), the genetic locus is identified by a 
number (not a letter) following the gene symbol, e.g., arg2. Dominant alleles are 
denoted by using uppercase italics for all three letters of the gene symbol, e.g., 
ARG2. Lowercase letters symbolize the recessive allele, e.g., the auxotroph 
arg2. Wild-type genes are designated with a superscript "plus" (sup6+ or 
ARG2+). Alleles are designated by a number separated from the locus number 
by a hyphen, e.g., arg2-14. Locus numbers are consistent with the original 
assignments; however, allele numbers may be specific to a particular laboratory. 

Phenotypic designations are written out or denoted by cognate symbols, without 
italics, and by the superscripts "plus" and "minus." For example, independence 
of and requirement for arginine can be symbolized, respectively, as Arg+ 
and Arg-. 

Gene clusters, complementation groups within a gene, or domains within a 
gene having different properties are designated by capital letters following the 
locus number, e.g., his4A, his4B. (Note that in the conventions of Demerec et 
al., 1966, capital letters following the gene symbol designate different loci.) 

Wild-type and mutant alleles of the mating-type and related loci do not follow 
the standard rules. The two wild-type alleles at the mating-type locus are designated 
MATa and MATα. The two complementation groups of the MATα locus 
are denoted MATα1 and MATα2. Mutations of the MAT genes are denoted, e.g., 
mata-1, matα1-1. The wild-type homothallic alleles at the HMR and HML loci 
are denoted HMRa, HMRα, HMLa, and HMLα. Mutations at these loci are 
denoted, e.g., hmra-1, hmlα-1. 

The following examples illustrate the conventions used in the genetic nomen- 
clature for S. cerevisiae: 

ARG2      A locus or dominant allele
arg2      A locus or recessive allele that produces a requirement for
          arginine as the phenotype
ARG2+     The wild-type allele of this gene
arg2-9    A specific allele or mutation at the ARG2 locus
Arg+      A strain not requiring arginine
Arg-      A strain requiring arginine
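
The yeast conventions illustrated above can be expressed as a short validating parser. This Python sketch is illustrative only; the names YeastSymbol and parse_yeast are assumptions, the superscript plus is typed as a trailing "+", and the capital-letter domain suffixes (his4A), the MAT, HMR, and HML loci, and non-italic phenotype designations are intentionally not covered.

```python
import re
from typing import NamedTuple, Optional


class YeastSymbol(NamedTuple):
    gene: str              # three-letter symbol, stored in lowercase
    locus: int             # locus number following the letters
    allele: Optional[int]  # allele number after the hyphen, if any
    dominant: bool         # True when all three letters are uppercase
    wild_type: bool        # True when the symbol carries a trailing "+"


_YEAST = re.compile(
    r"^(?P<gene>[A-Z]{3}|[a-z]{3})"  # all uppercase or all lowercase
    r"(?P<locus>\d+)"                # locus number (not a letter)
    r"(?:-(?P<allele>\d+))?"         # hyphenated allele number
    r"(?P<plus>\+)?$"                # superscript plus, typed as "+"
)


def parse_yeast(symbol: str) -> YeastSymbol:
    """Split a plain-text S. cerevisiae gene symbol into its parts."""
    match = _YEAST.match(symbol.strip())
    if match is None:
        raise ValueError(f"unrecognized yeast gene symbol: {symbol!r}")
    letters = match.group("gene")
    allele = match.group("allele")
    return YeastSymbol(
        gene=letters.lower(),
        locus=int(match.group("locus")),
        allele=int(allele) if allele is not None else None,
        dominant=letters.isupper(),
        wild_type=match.group("plus") is not None,
    )


for text in ("ARG2", "arg2", "ARG2+", "arg2-9", "sup6+"):
    print(text, "->", parse_yeast(text))
```

Because the pattern requires a locus number and uniform case, a phenotype designation such as Arg+ is rejected rather than misread as a gene symbol.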

For information on yeast mitochondrial genomes, see Grivell (1984). 

For most structural genes that code for proteins, the nonmutant ("wild-type") 
allele is usually dominant to the mutant form of a gene. In yeast, the convention 
for dominant, "normal" genes utilizes capitalized italic symbols such as HIS4 
and LEU2. In traditional genetics, we learn about genes through their mutations, 
and linkage maps are created by following mutant alleles in crosses. Published 
linkage data, therefore, consist of gene symbols for the mutant, usually recessive, 
alleles (e.g., on linkage group III, his4 and leu2). Those mutant alleles 
that are dominant to their nonmutant, "normal" alleles will appear on linkage 
maps in capital letters (SUP22 and FLD1 on chromosome IX). In addition, 
capital letters are used to represent dominant wild-type genes that control the 
same character and that are used for mapping (SUC2, SUC1, etc.), as well as 
DNA segments whose locations have been determined by a combination of 
recombinant DNA techniques and classical mapping procedures, e.g., RDN1, 
the segment encoding ribosomal RNA. 

Detailed yeast linkage maps have been published by Mortimer and Schild 
(1980, 1984). 
IV. OTHER FUNGI 

Genetic conventions in other fungi sometimes follow one of the systems 
outlined above. In the past, workers with "less popular" species tended to 
follow some version of the bacterial-A. nidulans conventions; more recently, the 
yeast system has been gaining in popularity. For example, yeast conventions are 
used for the plant pathogen Cochliobolus heterostrophus (O. Yoder, personal 
communication). Regrettably, many workers adopt idiosyncratic symbols. 

Both the "Handbook of Genetics, Vol. 1, Bacteria, Bacteriophages, and 
Fungi" (King, 1974) and "Genetic Maps 1984" (O'Brien, 1984) contain infor- 
mation about some of the better studied of the less popular fungi. Specific 
references, in alphabetic order by genus, follow: 

Ascobolus immersus (Decaris et al., 1974) 

Dictyostelium discoideum (Newell, 1984) 

Phycomyces (Cerda-Olmedo, 1974) 

Podospora anserina (Esser, 1974; Marcou et al., 1984) 

Schizosaccharomyces pombe (Gutz et al., 1974) 

Sordaria (Olive, 1974) 

Ustilago maydis (Holliday, 1974) 

Two species of Basidiomycetes, Coprinus radiatus and Schizophyllum com- 
mune, have been studied intensively, especially with respect to their incom- 
patibility factors. Consult the following references for more information about 
these systems: Raper (1966), Guerdoux (1974), Raper and Hoffman (1974), and 
Schwalb and Miles (1978). 
Table 7. Production of foreign proteins in non-Saccharomyces yeasts.

Yeast
  Protein(a)                 Location(b)  Promoter(c)                          Reference

Pichia pastoris
  β-galactosidase                         AOX1, DHAS                           380
  HBsAg                                   AOX1                                 81
  Tetanus toxin fragment C                AOX1                                 73
  Pertactin                               AOX1                                 306
  TNF                                     AOX1                                 359
  Streptokinase                           AOX1                                 154a
  SOD                                     AOX1                                 43, 375
  HIV gp120                  I, S         AOX1                                 C.A.S., R. Buckholz, unpublished results
  Invertase                  S            AOX1                                 381
  Bovine lysozyme            S            AOX1                                 97
  Human EGF                  S            AOX1                                 43
  Murine EGF                 S            AOX1                                 74
  Aprotinin                  S            AOX1                                 375
  HSA                        S            AOX1                                 K. Sreekrishna, personal communication

Hansenula polymorpha
  β-lactamase                I, S         MOX, FMD, DAS                        198
  HBsAg                      PERI         MOX, FMD                             340, 197
  PreS1-S2-HBsAg             PERI         MOX                                  197
  α-galactosidase            S            MOX                                  115
  Glucoamylase               S            FMD                                  137
  HSA                        S            MOX                                  174
  S.c. invertase             S            MOX                                  198

Kluyveromyces lactis
  Prochymosin                             LAC4                                 391
  IL-1β                                   S.c. PHO5, S.c. PGK                  119
  HSA                                     LAC4, S.c. PHO5, S.c. PGK            120
  HSA-CD4                                 S.c. PGK                             K. Fleer, personal communication
  Schw.o. α-amylase          S            Homologous                           368
  tPA                        S            ?                                    411
  TIMP                       S            ?                                    411

Yarrowia lipolytica
  β-galactosidase            I            LEU2                                 134
  S.c. invertase             S            XPR2                                 270
  Bovine prochymosin         S            XPR2, LEU2                           124
  Porcine IFN                S            XPR2                                 164, 271

Schizosaccharomyces pombe
  Polyoma middle-T Ag                     S.c. PGK                             27
  β-galactosidase                         SV40(v), fbp, adh, GRE, CaMV35S(v)   224, 176, 287
  CAT                                     nmt1, HCGα, CMV(v), SV40(v), GRE     249, 379, 287
  Human epoxide hydrolase                 adh                                  192
  Factor XIIIa                            adh                                  45
  IBD virus VP3                           adh, S.c. ADH1                       194
  E. coli β-glucuronidase                 CaMV35S(v)                           289
  Single-chain Ab                         adh                                  88
  Bacterio-opsin             PLM          adh                                  166
  STP1 glucose transporter   PLM          adh                                  329
  S.c. invertase             PERI         Homologous                           263
  S.dia. glucoamylase        PERI?        Homologous                           112
  S.c. α-mannosidase         CWALL?       Homologous                           226
  S.c. exoglucanase          CWALL?       Homologous                           226
  S.c. endochitinase         CWALL?       Homologous                           226
  Antithrombin III           S            S.c. ADH1, S.c. CYC1                 46
  Schw.o. α-amylase          S            Homologous                           368

(a) S.c., Saccharomyces cerevisiae; Schw.o., Schwanniomyces occidentalis; S.dia., Saccharomyces diastaticus.

(b) Location of expressed protein: I, intracellular; PERI, periplasmic; PLM, plasma membrane; CWALL, cell wall; S, secreted.

(c) Promoters given are native to the organism except: S.c., Saccharomyces cerevisiae; (v), viral; homologous, homologous to the gene expressed; GRE, glucocorticoid response elements; HCGα, human chorionic gonadotrophin α.
TABLE 10 

Vacuolar Sorting Signals in the N-Terminal and C-Terminal Propeptides of Plant Vacuolar Proteins 

Vacuolar protein         Location of propeptide   Targeting signal
Sweet potato sporamin    N-terminal               HSRFNPIRLPTTHEPA
Barley aleurain          N-terminal               SSSSFADSNPIRPVTDRAAST
Barley lectin            C-terminal               dg▼VFAEAIAANSTLVAE
Tobacco chitinase A      C-terminal               gn▼GLLVDTM

Note: In the amino acid sequence of the propeptide, hydrophobic amino acids are indicated by bold letters, and the positive and negative 
charges in the polypeptides are indicated by + and -, respectively. ▼ indicates the cleavage site of the CTPP. The N-linked glycan 
(Y) is attached to an Asn residue in the barley lectin CTPP. The exact N-terminal amino acids of prosporamin and proaleurain in 
tobacco cells are not known. The NPIR motif in the N-terminal propeptides and the hydrophobic/acidic motif in the CTPPs are 
indicated by the underlines (solid underlines and broken underlines, respectively). 

Based on Nakamura, K., and Matsuoka, K., Plant Physiol., 101, 1, 1993. With permission from the American Society of Plant Physiologists. 